FOREWORD


The Harry Diamond Laboratories located in Adelphi, Maryland served
as the site for the 22nd Conference on the Design of Experiments in Army
Research, Development, and Testing held 20-22 October 1976. This Army
agency co-hosted the first three conferences in this series with the
National Bureau of Standards when it was located there. It was a pleasure
to meet in the new quarters of the Harry Diamond Laboratories and take
advantage of their excellent facilities. Planning for these meetings
requires much effort and attention to detail and we are indebted to
Dr. Joseph Kirschner who served as Chairman for Local Arrangements and was
ably assisted by Grace Frazier and Stoyen Kimmel. We are pleased that
Colonel Thomas McGregor, Commanding Officer of the Laboratories, opened
the Conference and welcomed the participants. We look forward to meeting
at the Laboratories again in the future.

It is traditional to have invited speakers give essentially expository
talks on topics of current interest in statistics and probability. There
is also an attempt to provide talks that are somewhat consistent with the
theme of the mission of the Army installation at which the annual Conference
is held. This confluence of purposes was achieved. The first talk was
given by Professor J. Stuart Hunter of Princeton University on "The Measurement
Process." The crux of this talk was measurement when data is available over
time, such as in air pollution studies, and the speaker presented two different
models by which this could be accomplished. Later in the first morning
Professor Benjamin S. Blanchard of Virginia Polytechnic Institute and
State University gave a talk on "Management of Reliability." The
reliability theme pervades many Army installations and this is so at the
Harry Diamond Laboratories. On the afternoon of the second day there
were two sessions for invited speakers and each was devoted to a very
current topic in statistics where each topic has a fast developing
literature. The first speaker was Dr. Carl N. Morris of the RAND
Corporation who spoke on "Stein's Estimator, Its Generalizations and Its
Applications." This was followed by Professor Robert Hogg of the
University of Iowa who spoke on "Robust Statistical Procedures." The
subject matter in both of these talks has wide ranging applications
in a number of diverse activities of the Army. On the morning of the
last day of the meetings Professor Nozer D. Singpurwalla of the George
Washington University spoke on "Accelerated Life Testing." This topic
has a long history in Defense Department programs and is still a quite
active subject for statistical investigations.

The audience consisted of a large number of participants from Army
installations, other government agencies, and a number of investigators
from universities. A major purpose of the conference is to bring
together those engaged in scientific work in Army installations with other
investigators. This interaction has been going on successfully since the
inception of the program and it continued at this Conference. Statisticians
and others in Army installations discuss their work at technical sessions
and clinical sessions at each Annual Conference. For this Conference
there were eight technical sessions comprising eighteen papers and four
clinical sessions. At the clinical sessions a panel of experts
responds to problems raised by those in Army installations who have
usually given advance manuscript copies to the panelists. Besides the
technical aspects, these sessions provide a source for initiating future
collaboration between scientists in Army installations and those in
university life.

On the evening of the first day of the Conference a banquet is held
at which the Samuel S. Wilks Memorial Award of the American Statistical
Association and the Department of the Army is presented. At this meeting
the twelfth award was presented to Dr. Solomon Kullback, Professor Emeritus
of Statistics at the George Washington University. The award was made by
Dr. Joan Rosenblatt, Chairman of the Wilks Award Committee. Professor
Kullback was cited for substantive contributions to both the theory and
the application of statistics, including his work on multidimensional
contingency table analysis and cryptanalysis, and his outstanding
contributions in the application of statistics in the service of the Nation.

The Army Mathematics Steering Committee sponsors these meetings
on behalf of the Office of the Chief of Research and Development and
Acquisition to bring new developments in statistics to Army scientists
and engineers and to expose them to thinking that could be profitable
to them in the execution of their missions. The Committee has asked
that the Proceedings of the Conference be published and issued
Army-wide and to other scientific communities.


At the beginning of each calendar year the Program Committee for
these conferences is selected and meets in Washington, D.C. to suggest
areas of interest, to outline a program, and to suggest speakers for
the meeting to be held later that year. I would like to express my
appreciation to Dr. Frank Grubbs, Program Chairman for this year's
Committee, and to Dr. Douglas Tang, Chairman of the Subcommittee on
Probability and Statistics, Army Mathematics Steering Committee, for
their efforts and great help. My thanks also go to other committee
members involved in developing this year's program: Drs. Walter D.
Foster, Bernard Harris, Joseph M. Kirschner, Badrig Kurkjian, Clifford
J. Maloney, Robert L. Launer, and Douglas B. Tang. Dr. Francis G. Dressel,
Program Committee Secretary, as always was helpful in many ways in making
sure the program was a success. Thus, many helped in guiding this
Conference to a successful conclusion and this is very much appreciated.


Herbert  Solomon 
Conference  Chairman 
AGENDA

THE  TWENTY-SECOND  CONFERENCE  ON  THE  DESIGN  OF  EXPERIMENTS  IN 
ARMY  RESEARCH,  DEVELOPMENT  AND  TESTING 

20-22  October  1976 
Harry  Diamond  Laboratories 
*****  Wednesday,  20  October  ***** 

091S-0915 Registration - Lobby of the Administration Building: Building 205
0915-1215 GENERAL SESSION I - Auditorium of the Administration Building
CALLING  OF  CONFERENCE  TO  ORDER 

Mr.  Joseph  Kirshner,  Chairman  on  Local  Arrangements,  Harry 
Diamond  Laboratories,  Adelphi,  Maryland 

WELCOMING  REMARKS 

Colonel Thomas McGregor, Commanding Officer,
Harry Diamond Laboratories, Adelphi, Maryland

CHAIRMAN  OF  SESSION  1 

Dr. Frank E. Grubbs, Program Committee Chairman, Aberdeen
Proving  Ground,  Maryland 

THE  MEASUREMENT  PROCESS 

Professor  J.  Stuart  Hunter,  School  of  Engineering  and  Applied 
Science,  Princeton  University,  Princeton,  New  Jersey 

1030-1100  BREAK 

1100-1215  GENERAL  SESSION  I (CONTINUED) 

MANAGEMENT  OF  RELIABILITY 

Professor  Benjamin  S.  Blanchard,  Jr.,  Engineering  Extension 
Division,  Virginia  Polytechnic  Institute  and  State  University, 
Blacksburg,  Virginia 

1215-1315  LUNCH  - HDL  Cafeteria 

1315-1445 CLINICAL SESSION A - Auditorium of Building 205
CHAIRMAN 

Robert  L.  Launer,  US  Army  Research  Office,  Research  Triangle 
Park, North Carolina

PANELISTS 

Seymour Geisser, School of Statistics, University of Minnesota,
Minneapolis,  Minnesota 

Robert  V.  Hogg,  The  University  of  Iowa,  Department  of  Statistics, 
Iowa  City,  Iowa 

J.  Stuart  Hunter,  School  of  Engineering  and  Applied  Science, 
Princeton  University,  Princeton,  New  Jersey 

Herbert  Solomon,  Department  of  Statistics,  Stanford  University, 
Stanford,  California 

PROBLEMS  IN  TESTING  PHARMACOKINETIC  MODELS 

LTC Carl C. Peck and L. A. Hopkins, Blood Research Division,
Department of Surgery, Letterman Army Institute of Research,
Presidio of San Francisco, California

DIETARY  BRAN  AND  CELLULOSE:  EFFECTS  ON  SERUM  LIPIDS 

Walter D. Foster, Charlotte M. Heggi, Daniel H. Conner, Armed
Forces Institute of Pathology; Frank A. Franklin, Jr., Walter
Reed Army Medical Center; Samuel M. Wylde, Ener-G-Foods, Inc.;
Joe M. Blumberg, Oscar B. Hunter Memorial Laboratory, Washington, DC

1315-1445 TECHNICAL SESSION 1 - Room 2Q016

CHAIRMAN 

Langhorne  P.  Withers,  US  Army  Operational  Test  and  Evaluation 
Agency,  Falls  Church,  Virginia 

ANALYSIS  OF  AN  ERROR-TIME  RESPONSE  PERFORMANCE 

Michael  Hacskaylo,  Night  Vision  Laboratory,  USA  Electronics 
Command,  Ft.  Belvoir,  Virginia 

AN  EXPERIMENTAL  DESIGN  TO  DETERMINE  THE  FREQUENCY  DISTRIBUTION 
OF LASER RADAR (LADAR) RETURN SIGNAL VOLTAGES

Jerry W. Vickers, Systems Evaluation, Aeroballistics Directorate,
USA Missile R&D Command, Redstone Arsenal, Alabama

1445-1515 BREAK

CLINICAL  SESSION  B - Auditorium  of  Building  205 
CHAIRMAN 

Joan R. Rosenblatt, Statistical Engineering Laboratory, National
Bureau  of  Standards,  Washington,  DC 

PANELISTS 

A.  Clifford  Cohen,  Institute  of  Statistics,  University  of  Georgia, 
Athens,  Georgia 

Frank E.  Grubbs,  Aberdeen  Proving  Ground,  Maryland 

Bernard  Harris,  Mathematics  Research  Center,  University  of 
Wisconsin,  Madison,  Wisconsin 

Nozer D. Singpurwalla, Department of Operations Research, George
Washington  University,  Washington,  DC 

RELIABILITY ANALYSIS OF AIRFIELD LIGHTING SYSTEMS

Frank  Kuo  and  Ed  Lindow,  Construction  Engineering  Research 
Laboratory,  Champaign,  Illinois 

SIMPLIFIED  METHOD  FOR  DETERMINING  APPROXIMATE  LOWER  CONFIDENCE 
BOUNDS OF A SYSTEM WHOSE RELIABILITY FUNCTION IS DESCRIBED AS A
BETA 

Louis M. Iannuzzelli and R. Dostal, HQ, USA Armament Command,
Rock Island, Illinois

1515-1645 TECHNICAL SESSION 2 - Room 26016
CHAIRMAN 

Gertrude  Weintraub,  Picatinny  Arsenal,  Dover,  New  Jersey 

EVALUATION  OF  GUNNER  ERRORS  THROUGH  TIME  SERIES  ANALYSIS 

Letricha Greene and John Howerton, Systems Evaluation,
Aeroballistics Directorate, USA Missile R&D Command,
Redstone Arsenal, Alabama

RANGE  INSTRUMENTATION  POSITION  ACCURACY 
F. L. Cartar, Dugway Proving Ground, Dugway, Utah
1830 - SOCIAL HOUR AND BANQUET - Hampshire Inn

PRESENTATION  OF  THE  SAMUEL  S.  WILKS  MEMORIAL  AWARD 
Dr.  Frank  E.  Grubbs,  Master  of  Ceremonies 

*****  Thursday,  21  October  ***** 

0830-1010 CLINICAL SESSION C - Auditorium of Building 205
CHAIRMAN 

A.  Clifford  Cohen,  Institute  of  Statistics,  University  of 
Georgia, Athens, Georgia

PANELISTS 

Robert  Bechhofer,  Department  of  Operations  Research,  Cornell 
University,  Ithaca,  New  York 

Seymour Geisser, School of Statistics, University of Minnesota,
Minneapolis,  Minnesota 

Robert  V.  Hogg,  The  University  of  Iowa,  Department  of  Statistics, 
Iowa  City,  Iowa 

J.  Richard  Moore,  US  Army  Ballistic  Research  Laboratories, 
Aberdeen  Proving  Ground,  Maryland 

EXPERIMENTAL  DESIGN  FOR  LABORATORY  EVALUATION  OF  IMAGING 
SYSTEMS 

R.  Flaherty,  J.  Palmer  and  F.  Shields,  Night  Vision  Laboratory, 
USA Electronics Command, Ft. Belvoir, Virginia

A METHOD  FOR  DETERMINING  PAIRWISE  CONTRASTS  FROM  A FRIEDMAN 
TWO-WAY  LAYOUT,  BASED  ON  A THEOREM  BY  MARASCUILO 

Jimmie C. Deloach and Eugene Dutoit, USA Infantry Center,
Ft. Benning, Georgia

0830-1010  TECHNICAL  SESSION  3 - Room  2Q016 
CHAIRMAN 

J.  Bart  Wilburn,  Jr.,  14M  Branch,  US  Army  Electronic  Proving 
Ground,  Ft.  Huachuca,  Arizona 

ESTIMATE OF RELIABILITY IN THE STRESS-STRENGTH MODEL

Asit P. Basu, University of Missouri-Columbia, Department
of Statistics, Columbia, Missouri

UNDERLYING  PROBABILITY  DISTRIBUTION  OF  GUN  TUBE  FATIGUE  LIFE 

Ronald L. Racicot, Watervliet Arsenal, Watervliet, New York

FAILURE  PREDICTION  OF  FINITE  FLAWED  CERAMIC  PLATES  UNDER 
COMBINED  STRESSES 

Donald H. Neal, Army Materiel and Mechanics Research Center,
Watertown, Massachusetts

1010-1040  BREAK 

1040-1215 CLINICAL SESSION D - Auditorium of Building 205
CHAIRMAN 

Clifford J. Maloney, Bureau of Biologics, FDA, Bethesda,
Maryland

PANELISTS 

Robert Bechhofer, Department of Operations Research, Cornell
University, Ithaca, New York

G.E.P. Box, Department of Statistics, University of Wisconsin,
Madison, Wisconsin. Representing the Mathematics Research Center.

Bernard Harris, Mathematics Research Center, University of
Wisconsin,  Madison,  Wisconsin 

Herbert  Solomon,  Department  of  Statistics,  Stanford  University, 
Stanford, California

ESTIMATION  AND  EFFECT  OF  NOISE  CORRELATION  ON  VARIANCE 
ESTIMATION  FROM  MOVING  ARC  SMOOTHING 

Paul H. Thrasher, Quality Assurance Office, White Sands
Missile  Range,  New  Mexico 

1040-1215  TECHNICAL  SESSION  4 - Room  2Q016 

CHAIRMAN 

Walter D. Foster, Armed Forces Institute of Pathology,
Washington, DC

ROBUST  OUTLIER  DETECTION  IN  TRAJECTORY  DATA  REDUCTION 

Robert H. Turner and William S. Agee, Analysis and Computation
Division,  National  Range  Operations  Directorate,  White  Sands 
Missile  Range,  New  Mexico 

ON  THE  UPWARD  CONTINUATION  OF  FIRST  DERIVATIVES  OF  THE 
ANOMALOUS  GRAVITY  POTENTIAL  UNDER  CONSIDERATION  OF  A 
SUITABLE  DATA  BASE 

H.  Baussus  von  Luetzow,  USA  Engineer  Topographic  Laboratories, 
Ft.  Belvoir,  Virginia 

COMPARISON  OF  ERROR  RATES  AND  MISCLASSIFICATION  PROBABILITIES 
USING  BINOMIAL  AND  BAYESIAN  MODELS  FOR  PERSONNEL  CLASSIFICATION 

Frederick H. Steinheiser, Jr. and Kenneth I. Epstein, USA
Research  Institute,  Arlington,  Virginia 

1215-1315 LUNCH - HDL Cafeteria

1315-1415  TECHNICAL  SESSION  5 - Auditorium  of  Building  205 
CHAIRMAN 

Joseph S. Tyler, Jr., Chemical Research Laboratory,
Biophysics Laboratory, US Army Edgewood Arsenal, Edgewood,
Maryland

TABLE  LOOK  UP  AND  INTERPOLATION  FOR  A NORMAL  RANDOM  NUMBER 
GENERATOR 

William L. Shepherd and John N. Hynes, Systems Management
Division,  Instrumentation  Directorate,  White  Sands  Missile 
Range,  New  Mexico 

EIGENVECTORS  ANALYSIS  OF  EMPIRICAL  DATA  VERSUS  UTILIZATION  OF 
STANDARD FUNCTIONS

Oskar  M.  Essenwanger,  Physical  Sciences  Directorate,  USA 
Missile  RD&E  Laboratory,  USA  Missile  Command,  Redstone 
Arsenal,  Alabama 

1315-1415  TECHNICAL  SESSION  6 - Room  2G016 
CHAIRMAN 

Malcolm  S.  Taylor,  US  Army  Ballistic  Research  Laboratories, 
Aberdeen  Proving  Ground,  Maryland 


INDUCTION ON A MARKOV CHAIN

Richard H. Brugger, RAM Assessment Division, USA Armament
Command, Rock Island, Illinois

MARKOV DEPENDENT PROCESSES AND CONTINUOUS SAMPLING PLANS IN
TANDEM 

David L. Arp, Naval Weapons Center, China Lake, California

1415-1645 GENERAL SESSION II - Auditorium of Building 205
CHAIRMAN 

Dr. Douglas B. Tang, Department of Biostatistics/Applied
Mathematics, Division of Biometrics and Medical Information
Processing, Walter Reed Army Institute of Research,
Washington, DC

STEIN'S ESTIMATOR, ITS GENERALIZATIONS, AND ITS APPLICATIONS

Dr. Carl N. Morris, Rand Corporation, Santa Monica, California

1515-1545 BREAK

1545-1645 GENERAL SESSION II (CONTINUED)

ON  ROBUST  STATISTICAL  PROCEDURES 

Professor  Robert  V.  Hogg,  The  University  of  Iowa,  Department 
of Statistics, Iowa City, Iowa

*****  Friday,  22  October  1976  ***** 

0830-1015 TECHNICAL SESSION 7 - Room 2G016
CHAIRMAN 

Gerald  Andersen,  Battlefield  System  Integration  Directorate, 
Alexandria,  Virginia 
ABSTRACT. Our systems and equipment in the field have become more and more
complex; are not operationally available a good percent of the time; require
extensive maintenance and support; and are quite costly. One of the causes for
this dilemma is the emphasis that has been placed on performance and advanced
technology, while at the same time very little consideration has been given to
reliability, availability, and maintainability (RAM). More recently, a concerted
effort has been initiated to recognize RAM as necessary parameters of
system/equipment design and development. Military specifications and standards
have been generated and RAM requirements (to varying degrees) have been formally
applied on many programs. Although this effort has forced the recognition of
RAM to a considerable extent, many program implementation problems currently
exist and our systems/equipment in the field are still experiencing significant
difficulties.

In this paper the author has attempted to identify some of the problems
associated with current RAM implementation practices, and to recommend some
courses of action for improvement in the future. A significant challenge lies
ahead if we intend to derive some of the benefits from RAM.

1. INTRODUCTION. Through the past few decades emphasis in the design and
development of new systems and equipment has been placed primarily on performance
factors, delivery schedules, and initial acquisition price. The pressures
associated with increased performance have resulted in a dilemma where many items
currently in government inventories are highly complex, inoperative a good percentage
of the time, difficult to maintain, and in general too costly to justify. In other
words, we have produced a large quantity of systems and equipment with low
reliability and poor maintainability characteristics, and the level of support
necessary to sustain them operationally is considerable! This in turn has:

a. Threatened the overall availability and operational effectiveness of
systems and equipment in the field and hence, the defense of our country
(either directly or indirectly).

b.  Caused  high  maintenance  work  loads  and  increased  logistics  support 
resource  requirements. 

c. Increased life cycle costs for systems/equipment acquisition and
utilization, particularly those costs associated with system operation
and support throughout the life cycle.
The current trends of rising system costs plus inflation, combined with some
budgetary shifts from defense to other public sectors, are causing serious
concerns relative to our future defense capability.

More recently, an attempt has been initiated to counter these trends
through the recognition of some critical "cause and effect" relationships
involving reliability, availability, maintainability, performance, logistics
support, life cycle cost, etc. Experience has indicated that highly reliable
and maintainable systems/equipment are a means for improving operational
effectiveness while holding the line on life cycle costs. Reliability and
maintainability are indeed characteristics which are inherent in
system/equipment design, and the extent to which they are considered has a
significant impact on logistics support requirements and life cycle cost. Further,
the consideration of reliability and maintainability in the design process
must commence at the conceptual phase of system development and extend through
detailed full-scale engineering development, test and evaluation, and
production. In essence, it has been recognized by many that the conditions
noted below should be stressed in the future:

a.  Reliability,  availability,  and  maintainability  should  be  considered 
in  the  system  design  and  development  process  on  an  equivalent  basis 
with  performance  and  other  related  factors. 

b. Logistics support should be considered in the design process and
should be closely integrated with reliability, availability,
maintainability, and performance considerations.

c. Life cycle cost should be considered as a design parameter
(design to unit acquisition cost, design to unit operation and
support cost, design to unit life cycle cost).

A primary  objective  is  to  provide  the  necessary  management  emphasis  in  all 
future system/equipment acquisitions, or modifications for improvement, to
ensure  that  these  considerations  are  addressed  at  the  proper  level. 

With this objective in mind, it is now appropriate to review current
practices, assess the pros and cons of such, and determine the steps
necessary to further improve our systems and equipment. The author attempts to
accomplish this in the paragraphs below, with the discussion basically
focusing on the management of Reliability, Availability, and Maintainability
(RAM).

2. CURRENT RAM PRACTICES. Although reliability, availability, and
maintainability are recognized in many programs today, the implementation
practices associated with these areas still require some improvement. A few
characteristic problems as they currently exist in on-going programs are
outlined below (not necessarily presented in any specific order).

a.  Specification  of  System  Technical  Requirements 

(1) In many instances, quantitative factors are included in requests
for proposals (RFPs) and in contracts as "goals". Consequently,
these factors are indeed treated as goals and not as requirements,
and are considered only lightly (if at all) in program implementation.

(2) Quantitative factors are not always specified in meaningful terms.
Often, probabilistic values that cannot be realistically demonstrated
are specified instead of quantitative factors that can be understood,
allocated, and verified. For instance, it is questionable that one
can adequately verify a 0.9995 reliability requirement when limited
quantities of equipment are procured and the test sample is small.
Also, it is hard to explain a "0.9995 factor" to a design engineer
in a meaningful manner, where a MTBF or MTBM value would be more
appropriate. In essence, the mathematical "jargon" sometimes employed
is difficult to relate directly to design and is often misunderstood
by engineers and management personnel alike.

(3) The application of technical requirements is not always related to
specific mission objectives. As a result, it is difficult to
determine whether the requirements are too stringent or too loose relative
to the ultimate mission need. In many cases mission requirements are
not adequately defined early enough in the program, and one cannot
properly design equipment without a mission profile or scenario of
some type; thus, RAM requirements result from a "best-guess" approach
which is less than satisfactory.
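
Item a(2) above questions whether a 0.9995 reliability requirement can be verified with a small test sample, and suggests that an MTBF value is easier to work with. The arithmetic behind both points can be sketched as follows. This is only an illustration under assumptions that are not in the paper: a zero-failure (success-run) demonstration test with independent trials, a constant failure rate so that R(t) = exp(-t / MTBF), and an example confidence level and mission time chosen here. The function names are hypothetical.

```python
import math


def zero_failure_trials(reliability: float, confidence: float) -> int:
    """Smallest number of independent, all-successful trials needed to
    demonstrate `reliability` at a one-sided `confidence` level.

    Assumption (not from the paper): a binomial success-run test, where
    the requirement is reliability ** n <= 1 - confidence.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


def mtbf_for_reliability(reliability: float, mission_hours: float) -> float:
    """MTBF implied by a mission reliability.

    Assumption (not from the paper): constant failure rate, so that
    R(t) = exp(-t / MTBF) and therefore MTBF = -t / ln(R).
    """
    if not (0.0 < reliability < 1.0) or mission_hours <= 0.0:
        raise ValueError("need 0 < reliability < 1 and mission_hours > 0")
    return -mission_hours / math.log(reliability)


if __name__ == "__main__":
    # Example inputs only: 90% confidence and a 1-hour mission are assumptions.
    print("failure-free trials needed:", zero_failure_trials(0.9995, 0.90))
    print("implied MTBF (hours):", mtbf_for_reliability(0.9995, 1.0))
```

Under these assumptions, demonstrating 0.9995 at 90% confidence takes 4,605 consecutive failure-free trials, which is out of reach when only limited quantities of equipment are procured. The same requirement expressed as an MTBF of roughly 2,000 hours for a 1-hour mission is a figure a design engineer can allocate and track.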

b.  RAM  as  Design  Parameters 

Reliability, availability, and maintainability are not recognized as
design parameters. Past practices have promoted the concept of "design
the system quickly, put it into a test program, and fix it if necessary".
RAM have not been truly coupled into the design effort, but designated
for measurement or demonstration at the conclusion of full-scale
engineering development. This concept has been quite costly, particularly when
extensive system/equipment modifications are necessary to meet RAM
requirements at this late stage of engineering development.

c.  Application  of  Specifications  and  Standards 

(1) On a number of occasions the "panic" to release specifications for
a procurement results in a fragmented document incorporating
conflicts and contradictions. RAM is not properly integrated into the
overall product. The specification is without doubt one of the most
important aspects of a program, but is not always given the
necessary level of attention because of the tight schedule requirement
to publish something for immediate dissemination purposes. The
consequences frequently result in problems occurring throughout the
remainder of the program.

(2) Military specifications and standards (e.g., MIL-STDs-470, -471, -781,
-785) are often arbitrarily applied to a contract in terms of "blanket
coverage" without the tailoring of such to the specific program need.
This can result in the application of meaningless requirements,
untimely activities and information, too much data of little value,
and high consequential program costs. Specifications and standards
should address real time task enforcement, product measurement and
control, with less overall dependence on test and demonstration at
the end of full-scale engineering development.

d. Structuring of Test Programs


(1) System/equipment testing is accomplished to different environmental
profiles than what is actually experienced in the field. Hence, the
test results are not necessarily a verification that the intended
requirements have (or have not) been met. This relates to the
initial inadequate definition of mission profiles or scenarios as
discussed in Paragraph 2a(3) above.

(2) Many test and demonstration programs are accomplished at the end
of full-scale engineering development after the commitment of
production/operation funds. Testing at this stage can only measure
the worth of a design configuration at a point when it is too costly
and time consuming to make major changes to correct a problem for
RAM.

e. Producer Accountability

Producer accountability is generally lacking! If the system/equipment fails
in test and demonstration, the policy in some cases has been to discount the
failures or to change the standards such that the system will pass. How
often is the system/equipment actually rejected by the customer because of
failure to pass RAM tests? In such cases, is the producer actually required
to initiate the necessary corrective action at his own expense? Perhaps
there are some cases where the producer is actually held accountable for his
design for RAM; however, in numerous other instances the system/equipment
is accepted regardless of the outcome of RAM verification testing.


The problems outlined above are representative of areas where current
implementation practices concerning RAM need improvement. In all cases the type
of problems indicated have been recognized, and some action is being taken (to
varying degrees) in an attempt to improve future system/equipment acquisitions
from the RAM standpoint. However, in spite of what is currently underway relative
to RAM activities, a great deal of additional effort is required if the objectives
of RAM in system/equipment design and development are to be truly realized.

3. CHALLENGES FOR THE FUTURE. At this point the major question is--Are We
Serious About Reliability, Availability, And Maintainability? The author firmly
believes that we are! However, every effort must be made to preclude or alleviate
some of the problem areas mentioned above. It is felt that no new policies per
se are necessary, but that a new approach to policy implementation is definitely
required. Some key implementation factors and challenges for the future are noted.


a. More front end planning, programming, and budgeting is required relative to
the inclusion of RAM factors. In other words, RAM considerations must be
addressed in Decision Coordinating Papers (DCPs), Operational Capability
Objectives (OCOs), Letters of Agreement (LOAs), Outline Development Plans
(ODPs), Required Operational Capability (ROC) documentation, and so on.

Referring to Figure 1, which illustrates the classical program phases, RAM
should be initially covered in the conceptual phase of system design and
development. The intent is not to be overly stringent relative to the
specification of RAM requirements at this stage, but to properly address
the major issues pertaining to RAM. In addition, program budgets must
reflect RAM decisions--i.e., increase the procurement dollars slightly to
acquire the necessary RAM and reduce the support dollars to reflect the
corresponding reduction in system support cost.
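
The budget trade described above, spending slightly more on procurement in order to reduce support cost, can be sketched as a simple break-even calculation. This is an illustration only: the linear cost model (acquisition cost plus a constant annual support cost, with no discounting or inflation), the function names, and every dollar figure are assumptions introduced here, not data from the paper.

```python
def life_cycle_cost(acquisition: float, annual_support: float, years: float) -> float:
    """Total cost over `years` under an assumed linear model:
    one-time acquisition cost plus a constant annual support cost."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return acquisition + annual_support * years


def break_even_years(base_acquisition: float, ram_acquisition: float,
                     base_support: float, ram_support: float) -> float:
    """Years of operation after which the extra acquisition cost of the
    RAM-improved design is repaid by its lower annual support cost."""
    extra_acquisition = ram_acquisition - base_acquisition
    annual_saving = base_support - ram_support
    if annual_saving <= 0:
        raise ValueError("the improved design must lower annual support cost")
    return extra_acquisition / annual_saving


if __name__ == "__main__":
    # Hypothetical figures in arbitrary cost units, chosen only for illustration.
    baseline = {"acquisition": 100.0, "annual_support": 20.0}
    improved = {"acquisition": 110.0, "annual_support": 15.0}
    years = 10
    print("baseline LCC:", life_cycle_cost(baseline["acquisition"], baseline["annual_support"], years))
    print("improved LCC:", life_cycle_cost(improved["acquisition"], improved["annual_support"], years))
    print("break-even (years):", break_even_years(
        baseline["acquisition"], improved["acquisition"],
        baseline["annual_support"], improved["annual_support"]))
```

A real program budget would also account for discounting, inflation, and uncertainty in the support estimates. The sketch is meant only to make the direction of the trade concrete.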

b. RAM must be treated more as a "discipline" throughout the system/equipment
life cycle and in particular, the early design process.

Figure 2 illustrates the system life cycle process and addresses typical
RAM activities in each phase of the process. Not only are RAM activities
applicable in each major evolutionary stage of system development, but
these activities must be closely interrelated throughout! Of particular
significance are the decisions pertaining to RAM which are a part of the
requirements depicted in Blocks 2 through 8 of Figure 2. Experience has
indicated that ultimate system life cycle cost is significantly influenced
by design decisions made during the conceptual and validation phases of a
program. The overall impact of actions affecting life cycle cost is
reflected by the "trend" curve in Figure 3. Further, experience has verified
that life cycle cost is highly influenced by RAM, particularly those costs
associated with system operation and support. Thus, RAM must be addressed
early in the system life cycle if the end product is to be cost effective.

c. Program management for RAM must be significantly strengthened. More
specifically:

(1)  Realistic  and  meaningful  requirements  must  be  clearly  specified 
early  in  the  system  life  cycle. 

(2) Specifications must be improved and more precisely "tailored" to
meet the actual needs. User involvement in the initial preparation
of specifications is recommended.

(3) Requests for proposals (RFPs) must leave no doubt that RAM and
performance are "equals" in priority and importance.

(4)  Program  managers  must  be  held  technically  accountable  for  RAM  as 
well  as  for  other  factors. 

(5) Program "checks and balances" must be provided for management
control (and audit for compliance) relative to RAM requirements.
Formal program reviews and technical design reviews must address
major RAM issues.

(6) Integrated test planning is required. As the purpose of testing
is to ensure that the system/equipment design meets all stated
requirements (including RAM), it is essential that such testing be
accomplished in the proper environment. If the test conditions
duplicate or exceed the ultimate field environment, the test results
will be effective in ensuring that RAM requirements will be met after
system/equipment deployment. If not, testing will be ineffective.

Additionally, a number of different tests may be accomplished at
different stages in the life cycle. All individual tests must be
addressed on an integrated basis to ensure that the desired information
is provided at the right time in the system life cycle. Too much
testing too early is costly; accomplishing tests too late in the
program could be costly; and redundancies in testing may also be costly.

(7) More producer involvement after the system is in operation is
desirable. In many instances, the producer should be held responsible for
correcting major field deficiencies.

(8) There should be more innovative approaches to better contracting
for RAM. One should consider the appropriate use of: penalty/incentive
provisions; penalty clauses to cover poor workmanship and design
practices; warranties at the piece part level; and meaningful progress
payment schedules. The application of the appropriate contractual
provisions for RAM requirements should create the desired emphasis relative
to RAM.

(9) Strict and timely enforcement of RAM program requirements is essential.


d. Managers and organizations must be educated relative to the benefits derived
through the proper level and application of RAM. This is perhaps the
greatest challenge, since it is felt that many of the problems experienced in the
past could have been avoided had the benefits of RAM been adequately
understood and accepted. In addition, with the proper education and
understanding, many of the desired objectives mentioned above should be readily
attainable.

4. CONCLUSION. The past few decades have led to many advances toward
focusing attention on reliability, availability, and maintainability (RAM). The
next decade is significant in terms of actual realization of the benefits derived
through RAM. The proper levels and applications of RAM are indeed necessary to
improve overall system/cost effectiveness at reduced life cycle cost.
Addressing the issues outlined in Paragraph 3 is believed to be a step in the right
direction and constitutes a major challenge for the future. With educational
know-how, persistence, and dedicated effort, it is believed that this challenge can
be met.

ABSTRACT. Analyzing drug disposition data using pharmacokinetic
modeling techniques is a commonly used approach to reducing such data
to therapeutically useful facts. However, certain conceptual and
statistical problems arise as a result of the data analyst's choice
of (1) objectives of the analysis, (2) the class of models to fit the
data, (3) the data fitting procedure, (4) the technique(s) for
assessing goodness of fit, and (5) ultimately, the most acceptable
model. These problems are introduced here along with some current
techniques for overcoming them. The advice of the panelists is
presented along with our consideration of their recommendations.

1. INTRODUCTION. Dosing decisions in medical therapeutics often
involve deciding how much, how frequently, and how long to administer
a given drug to a particular patient. Such decisions are rendered
much less arbitrary if the therapist has some notion of the time
course of drug distribution and elimination from the body, as well as
a knowledge of the relationship of these quantitative features of drug
disposition to pharmacologic effects. Surprising as it may seem,
exacting knowledge of this sort is known for only a small proportion
of substances currently used in medical therapeutics. In the main,
dosing regimens have been developed on an empirical basis by a trial
and error process.


Note. The presentation of this paper at the Conference included
examples of problems encountered in analysis of pharmacokinetic data
in our laboratory. In order to provide space for comments by the
panelists (paraphrased by us) and subsequent discussions, numerical
examples are omitted. The interested reader is referred to the
paper of Boxenbaum et al.1 for examples of pharmacokinetic data
which typify the issues addressed in this paper.


In recent years a general approach to gathering and organizing
drug disposition information has been developed and is frequently
referred to as "pharmacokinetics." Pharmacokinetics has been defined
by Gibaldi and Perrier as "the study of the time course of drug and
metabolite levels in different fluids, tissues, and excreta of the
body, and the mathematical relationships required to develop models
to interpret such data."2

For the purpose of making quantitative therapeutic decisions,
a pharmacokinetic analysis of drug data can contribute in several
ways. First of all, a model which accurately describes the time
course of the drug in the body as well as in particular pools
can be quite helpful in choosing dosing size and dosing frequency.
This presumes, of course, that the therapist has some notion of
desirable upper and lower bounds for drug amounts in the body or pool
of interest. The behavior of linear systems under single and multiple
dose administration as well as oral ingestion and intravenous infusion
is well worked out.2,3 Certain "derived" parameters, such as
"apparent distribution volume," "body clearance," "terminal elimination
half-time," and "extent of bioavailability" can be operationally
useful in making dosing decisions. Knowledge of the influence of
pathologic states on these derived parameters can result in optimal
dosing regimens in the face of disease-induced alterations in
distribution and elimination.
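
As an illustration of how such derived parameters follow from a fitted
model, the sketch below assumes the simplest case: a one-compartment model
after an intravenous bolus, C(t) = A exp(-lam t). The function name, the
dose, and the parameter values are hypothetical and are not taken from the
paper; the relations used (half-time = ln 2 / lam, AUC = A / lam,
clearance = dose / AUC, volume = dose / A) are the usual ones for this
particular model only.

```python
import math

def derived_parameters(dose, A, lam):
    """Derived quantities for C(t) = A * exp(-lam * t) after an IV bolus.

    dose: amount given (e.g. mg); A: intercept (mg/L); lam: rate constant (1/h).
    """
    half_time = math.log(2.0) / lam   # terminal elimination half-time (h)
    auc = A / lam                     # area under the curve, 0 to infinity
    clearance = dose / auc            # body clearance (L/h)
    volume = dose / A                 # apparent distribution volume (L)
    return {"half_time": half_time, "auc": auc,
            "clearance": clearance, "volume": volume}

# Hypothetical values chosen only for illustration.
params = derived_parameters(dose=500.0, A=10.0, lam=0.1)
```

For multi-compartment models the same quantities are defined differently
(for example, the half-time refers to the smallest rate constant), so this
sketch should not be applied beyond the mono-exponential case.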

Secondly, insights into drug-body interactions can be obtained
from pharmacokinetic analyses. For example, a mathematically
zero-order elimination process implies "saturation" of an elimination
mechanism, perhaps a hepatic enzyme system. Observation that the
renal clearance and body clearances of a drug are identical suggests
that the kidney is the major elimination organ. A renal clearance
which is numerically in excess of glomerular filtration rate implies
tubular secretion plus glomerular filtration as mechanisms of renal drug
processing. Insights of this nature contribute to therapeutic
decision-making by alerting the therapist to special precautions he
must take in designing a therapeutic regimen for multiple dosing in a
patient with a diseased liver or kidney.

In this paper we wish to summarize some current approaches to
analyzing pharmacokinetic data by identifying some problem areas and
presenting the responses of panelists to them.

2. PHARMACOKINETIC MODELS: MATHEMATICAL DESCRIPTIONS OF DRUG
DISPOSITION. Conceptually, the pharmacokinetic model is usually
viewed as a system of inter-connected pools or compartments (Figure 1).
The arrows between pools represent drug transfer directions and the
symbols are interpreted as transfer rates. The drug is
considered to be introduced into one of the compartments and body
fluid samples are taken from one or more of the pools. Mathematically,
the model may be defined as a system of differential equations.
Linear differential equations (first order) have been the most fully
explored and frequently applied drug disposition models.3 Although
many drugs undergo apparent first order distribution and elimination
processes, this is not always the case. Apparent zero order or
combinations of zero and first order processes do occur in drug
kinetics, which render models of the Michaelis-Menten type
applicable.4 However, for the purposes of this discussion we will
confine our attention primarily to the class of linear models.
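
To make the distinction between zero order and first order behavior
concrete, the following sketch integrates a single-pool Michaelis-Menten
elimination law, dC/dt = -Vmax C / (Km + C), with a classical Runge-Kutta
step. The parameter names (vmax, km) and all numerical values are
assumptions introduced for illustration; none come from the paper. At
concentrations far above km the rate is nearly constant (zero order), and
far below km it is nearly proportional to concentration (first order with
rate constant vmax/km).

```python
def mm_rate(c, vmax, km):
    """Michaelis-Menten elimination rate at concentration c."""
    return vmax * c / (km + c)

def simulate(c0, vmax, km, dt, steps):
    """Integrate dC/dt = -mm_rate(C) with classical 4th-order Runge-Kutta."""
    f = lambda c: -mm_rate(c, vmax, km)
    c, path = c0, [c0]
    for _ in range(steps):
        k1 = f(c)
        k2 = f(c + 0.5 * dt * k1)
        k3 = f(c + 0.5 * dt * k2)
        k4 = f(c + dt * k3)
        c += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        path.append(c)
    return path

# Hypothetical parameter values.
vmax, km = 10.0, 1.0
high = mm_rate(1000.0, vmax, km)   # close to vmax: zero-order regime
low = mm_rate(0.001, vmax, km)     # close to (vmax/km)*c: first-order regime
path = simulate(c0=100.0, vmax=vmax, km=km, dt=0.01, steps=500)
```

On a cartesian plot the early part of `path` falls almost linearly, which
is the pattern that leads an analyst to consider a zero order or
Michaelis-Menten model.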


Integrated solutions to systems of linear differential
equations assume a certain simplicity and order. An n-compartment
open model (with bidirectional drug transfer between all adjacent
pools) yields a linear combination of n exponentials:

    D_j = \sum_{i=1}^{n} A_i e^{-\lambda_i t}          (Equation 1)

where D_j = drug amount or concentration in the jth pool;
n = number of compartments; A_i, \lambda_i = arbitrary parameters of the
model which are various algebraic combinations of the original
"micro"-rate constants, volume scalars, and initial conditions.

3. METHODS OF PHARMACOKINETIC DATA ANALYSIS. Development of a
pharmacokinetic analysis usually proceeds as follows: (1) serial drug
concentrations are measured in a body fluid following a dose
administration, (2) some procedure is used to choose a class of
probable models which are appropriate for the purpose of the analysis,
(3) the data are fitted to the models by some procedure resulting in
model parameter estimates, (4) an assessment is made of the goodness
of fit of the model to the data, and (5) a "most acceptable" model
is chosen. The remainder of the discussion is to focus on some
problems encountered in steps (2), (3), (4), and (5).

[Figure: Some linear pharmacokinetic models]

Step 2. Specifying the Class of Probable Models. Choosing the
class of models to be considered usually involves a preliminary
study of the concentration/time-course data. If a cartesian plot
reveals a predominantly linear decay profile, then a zero order
model or Michaelis-Menten model is usually considered. Curvilinear
decay curves are rendered segmentally linear on log-concentration/
time plots if the data behave as a poly-exponential. The number of
straight-line segments can be used as the initial number of
exponential terms to be included in the model. In addition, the
slopes and y-intercepts of these segments can be used as starting
points for iterative parameter estimation procedures. Although most
pharmacokineticists proceed in this fashion using manual or partially
automated graphical procedures, attempts have been made to fully
automate this phase of the analysis.5,6
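
The graphical procedure just described is often called curve stripping
or the method of residuals. A minimal sketch for the two-exponential case
is given below, assuming the analyst has already decided by eye how many
late points lie on the terminal straight-line segment (the argument
n_tail). The function names and the synthetic, noise-free data are
hypothetical; this is not the automated procedure of the cited references.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def strip_two_exponentials(times, concs, n_tail):
    """Peel a bi-exponential: fit the last n_tail points on a log scale,
    subtract that term, then fit the positive residuals of the early points."""
    b0, b1 = fit_line(times[-n_tail:], [math.log(c) for c in concs[-n_tail:]])
    a2, lam2 = math.exp(b0), -b1
    early = [(t, c - a2 * math.exp(-lam2 * t))
             for t, c in zip(times[:-n_tail], concs[:-n_tail])]
    early = [(t, r) for t, r in early if r > 0]   # log needs positive residuals
    r0, r1 = fit_line([t for t, _ in early], [math.log(r) for _, r in early])
    return (math.exp(r0), -r1), (a2, lam2)

# Noise-free synthetic data from hypothetical parameters (not from the paper).
times = [0.25, 0.5, 1.0, 1.5, 2.0, 8.0, 12.0, 16.0, 24.0]
concs = [8.0 * math.exp(-1.5 * t) + 2.0 * math.exp(-0.1 * t) for t in times]
fast, slow = strip_two_exponentials(times, concs, n_tail=4)
```

The pairs `fast` and `slow` hold (coefficient, rate constant) estimates
that would then serve as starting points for an iterative least-squares
fit. With noisy data the choice of n_tail can change the estimates
appreciably.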

A decision must be made regarding the exact form of the
mathematical model to use in the data fitting phase. While data may be
fitted directly to systems of differential equations,7,8 the usual
practice is to use the integrated form of the model. In the case of
linear pharmacokinetic models, this reduces to fitting data to a form
of Equation 1. The analyst must also decide whether to parameterize
the equation explicitly in the "micro"-rate constants (K_ij) or use
the "macro"-rate constants (A_i, \lambda_i). This last issue was addressed
by one of the panelists (G.B.) and is discussed below.

Step 3. Fitting the Model to Data. Fitting the model equations
to pharmacokinetic data is usually done using an automated
least-squares (LS) program such as SAAM7 or NONLIN.8 With two exceptions,
currently employed pharmacokinetic models are nonlinear with respect
to their parameters in their integrated forms and therefore require
nonlinear LS data fitting procedures for estimating parameter
values. The two exceptions are one-compartment open models with
(a) purely zero order elimination or (b) first order elimination
(which can be linearized by a log transformation of the entire
model). Among problems encountered in this stage of the analysis
are (1) appropriateness of the LS criterion for minimization,
especially as regards deviations of the system under study from
assumptions inherent in the LS approach and the large influence
that aberrant data values can have on the parameter estimates,
and (2) whether and how to "weight" data for the analysis. These
two problems constitute part of the requirement for assessing
adequacy of the model (addressed by panelist R.H.).
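
Exception (b) can be sketched as follows: taking logarithms of
C(t) = A exp(-lam t) gives ln C = ln A - lam t, a straight line that
ordinary (or weighted) linear least squares can fit directly. The function
names, weights, and data below are hypothetical. Note, as an assumption
worth checking in any real analysis, that fitting on the log scale changes
the implied weighting of the original concentrations: it behaves roughly
like weighting by 1/C^2, i.e. assuming a constant relative error.

```python
import math

def weighted_line(xs, ys, ws):
    """Weighted least-squares line; minimizes sum w*(y - b0 - b1*x)^2."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def fit_one_compartment(times, concs, weights=None):
    """Fit C(t) = A*exp(-lam*t) through ln C = ln A - lam*t."""
    ws = weights if weights is not None else [1.0] * len(times)
    b0, b1 = weighted_line(times, [math.log(c) for c in concs], ws)
    return math.exp(b0), -b1

# Noise-free, hypothetical data.
times = [1.0, 2.0, 4.0, 8.0, 12.0]
concs = [10.0 * math.exp(-0.2 * t) for t in times]
A_hat, lam_hat = fit_one_compartment(times, concs)
```

With error-free data every weighting scheme returns the same estimates;
the choice of weights matters only once measurement error is present,
which is exactly the difficulty raised in item (2) above.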


Step 4. Assessing Goodness of Fit. An evaluation of the goodness
of fit of the model equation(s) to the data is a highly desirable
procedure in pharmacokinetics. The exact form that this assessment
takes will depend upon the overall objective of the pharmacokinetic
analysis, models used, and the fitting procedure employed. For
example, the analyst may be primarily interested in developing a
descriptive equation employing "macro"-rate constants to use in
computation of "derived" parameters, or his principal intent may be
to estimate "macro"-rate constants of a specific compartmental model;
these divergent objectives will determine the criteria as well as the
technique employed in judging goodness of fit. If an LS data fitting
procedure has been employed, use of residual plots and analysis of
residuals for their distributional properties is appropriate.

We have employed these techniques to evaluate some of our
pharmacokinetic data analyses and found them to be particularly useful. Plots
of weighted standardized residuals against drug concentrations
reveal patterns which at a glance allow assessment of adequacy of
weighting (stabilizing the variance about the regression line),
model specificity (search for systematic deviations of residuals from
the regression line) and randomness of residual distribution. Further
analysis of residuals alone for distributional properties (e.g. mean,
median, variance, skewness, kurtosis, specific tests of normality) has
been enlightening but not always useful. As pointed out by panelist
Dr. R. Hogg, the use of normality tests may constitute too severe a
criterion for use in an area where the validity of normal assumptions
is in serious question from the outset. In this regard, it was
suggested by one of the panelists that the Shapiro-Wilk test for
normality might be reasonable.
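
A few of the residual summaries mentioned above can be computed with very
little code. The sketch below is illustrative only: the observed and
predicted concentrations are invented, the weights assume a constant
relative error (w = 1/predicted^2), and the helper names are hypothetical.
It produces weighted residuals scaled to unit standard deviation, a count
of sign runs (few runs in time order suggest systematic lack of fit), and
a sample skewness.

```python
import math

def standardized_residuals(observed, predicted, weights):
    """Weighted residuals sqrt(w)*(obs - pred), scaled to unit sample SD."""
    raw = [math.sqrt(w) * (o - p) for o, p, w in zip(observed, predicted, weights)]
    n = len(raw)
    mean = sum(raw) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in raw) / (n - 1))
    return [r / sd for r in raw]

def count_sign_runs(residuals):
    """Number of runs of like-signed residuals (taken in time order)."""
    signs = [r > 0 for r in residuals if r != 0]
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def skewness(values):
    """Sample skewness g1 = m3 / m2**1.5 (moments about the mean)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical observed and model-predicted concentrations.
obs = [9.1, 7.2, 5.9, 4.4, 3.8, 2.9]
pred = [9.0, 7.4, 6.0, 4.9, 4.0, 3.3]
w = [1.0 / p ** 2 for p in pred]          # assumes constant relative error
z = standardized_residuals(obs, pred, w)
runs = count_sign_runs(z)
g1 = skewness(z)
```

In this invented example all but the first residual are negative, so the
run count is small, the kind of pattern that a residual plot would show
at a glance as a systematic deviation.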


Step 5. Choosing the Most Acceptable Model. Ultimately, all data
analyses must be terminated. This phase in pharmacokinetic data
analysis can be a troublesome problem when no clear-cut model emerges
more convincingly acceptable than others in the class of models
explored, or when attempts at weighting leave the analyst puzzled
about adequacy of various weighting schemes. Analysis "termination
criteria" do emerge, however, when the overall objective of
the analysis is integrated with the other phases as is developed
in the discussion below. It should be noted also that a satisfactory
termination of data analysis is closely tied to the adequacy of the
overall design of the pharmacokinetic experiment. Optimally, the
experimentalist and the data analyst should communicate in the
experimental design stage so that sampling times, number of replicates,
etc. are designed to "optimize" the information gain from the effort.
This translates into a pre-experimental consideration of models to be
used in analysis of the data and design of the experimental details so
that statistical estimates of model parameters are at minimum
variance within the practical constraints of experimental technology
and costs.


Dr. G. E. P. Box: A central issue which is inherently important
in each problem area cited above is the overall objective of the
exercise. Clear recognition of the goal(s) of a particular
pharmacokinetic experiment leads to clarity in the subsequent data analysis.

Discussion: On the surface, these remarks seem almost
unnecessary, for the thoughtful data analyst should always have a
clear idea of the goals of the exercise. However, Dr. Box correctly
detected some vagueness in the objectives of analyzing pharmacokinetic
data relevant to the ultimate use of the results. We accept Dr. Box's
perspicacious comments and wish to cite some developments in recent
pharmacokinetic literature which contribute to clarifying the
objectives of pharmacokinetic analyses. While postulating a class of
pharmacokinetic models in terms of compartmental schematics with
specific inter-compartmental connections is intellectually attractive,
the effort required to test, evaluate, and find an acceptable one may
be far in excess of that necessary to fulfill the clinical goals of
the experiment. If knowledge relevant to making dosing decisions is
the principal purpose, then a data analytic approach which
concentrates on estimation of macro-parameter models may be adequate.
Wagner has recently published a series of articles which argue
these points forcefully and which propose simplified data analytic
techniques for computing useful pharmacokinetic parameters.


If the analyst perceives that the objective of the analysis
is to provide tools for prediction and for computing "derived"
pharmacokinetic parameters, and not to test specific compartmental
models which were derived from differential equations, then he is
not restricted to exclusive use of the class of poly-exponentials.
For example, Wold et al. propose the use of cubic spline procedures
for computing area under the curve and terminal drug decay half-time,
and give a specific pharmacokinetic example to illustrate the method.

Dr. R. V. Hogg: LS data fitting is not the only available option
and "robust" statistical procedures should be considered. [In his
formal presentation16 "On Robust Statistical Procedures," Dr. Hogg
outlined several possible alternatives to the LS approach to parameter
estimation.]

Discussion: Use of robust statistical procedures indeed offers a
potential contribution to analysis of pharmacokinetic data. Although
these approaches pose computational difficulties, they are attractive
both from the point of view of (a) relaxation of the more restrictive
normal assumptions inherent in LS procedures and (b) minimization of
the effects of "erratic" data (outliers). We have not yet applied any
of these approaches to our own problems, although we are aware of one
group which has. Frome and Yakatan17 used both LS and
least-absolute-deviation criteria in obtaining parameter estimates from the fit of a
one-compartment open first order model to a set of pharmacokinetic
data.
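
The contrast between the LS and least-absolute-deviation (LAD) criteria
can be sketched on the log-linearized one-compartment model. This is an
illustration under stated assumptions, not the method of the cited
authors: the data are invented, the model is fitted as a straight line in
log-concentration, and the LAD fit is found by exhaustive search, relying
on the fact that for a straight line an optimal LAD solution passes
through at least two data points.

```python
from itertools import combinations

def ls_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def lad_line(xs, ys):
    """Least-absolute-deviation line by checking every pair of points
    (exact for a straight line, practical only for small samples)."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(abs(y - intercept - slope * x) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, intercept, slope)
    return best[1], best[2]

# Hypothetical log-concentrations on the line 2.0 - 0.2*t, one aberrant value.
ts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 - 0.2 * t for t in ts]
ys[3] += 1.0                        # outlier at t = 4
ls_fit = ls_line(ts, ys)
lad_fit = lad_line(ts, ys)
```

In this example the LAD line ignores the single aberrant value and
recovers the underlying slope, while the LS slope is pulled toward the
outlier. For nonlinear models the LAD criterion requires an iterative
optimizer, which is one source of the computational difficulty noted
above.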


Dr. S. Geisser: Consideration should be given to the use of the
Cp statistic18 and predictive sample reuse methods19-21 for assessing
goodness of fit and for developing data analysis termination strategies.

Discussion: The Cp statistic was originally derived for use
in making decisions about the number of terms to include in
linear models where normal assumptions hold. Therefore, use of
this approach for deciding among several poly-exponential models
must be viewed as an ad hoc procedure, the theoretical basis for
which remains unexplored. Nevertheless, the technique is appealing.
Given that the "total squared error" computed from a nonlinear
regression bears some inexact but semiquantitative relationship to
the "true" squared biases and squared random errors, then plotting
Cp versus p for various pharmacokinetic models† may yield some
basis for choice.

† Here p might be considered the number of parameters of the model.
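
For readers unfamiliar with the statistic, a minimal sketch of its usual
linear-model form, Cp = RSS_p / s^2 - (N - 2p), is given below. The
residual sums of squares are invented, and estimating s^2 from the largest
model considered is an assumption of this sketch; as the discussion notes,
applying the formula to poly-exponential fits is ad hoc.

```python
def mallows_cp(rss_p, s2, n, p):
    """Mallows' Cp = RSS_p / s^2 - (n - 2p)."""
    return rss_p / s2 - (n - 2 * p)

# Hypothetical residual sums of squares for models with p parameters, n = 20.
n = 20
rss = {2: 48.0, 4: 17.0, 6: 14.0}
s2 = rss[6] / (n - 6)     # error variance estimated from the largest model
cp = {p: mallows_cp(r, s2, n, p) for p, r in rss.items()}
```

A model whose Cp is close to p shows little evidence of bias; by
construction the largest model always has Cp equal to p, so interest lies
in the smallest model that comes close to that line on a plot of Cp
against p.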


Use of predictive sample reuse methodology for assessment
of different predictive functions is apparently a rather recent
development in statistics. The available papers on predictive
sample reuse are not easy for the non-statistician to understand;
therefore, a brief description of the technique in the present context
will be given. As with the Cp statistic, a number associated
with a given fit of a specific model to data is desired which will
allow discrimination between models such that a most reliably
predictive model may be identified. This number, call it F_p, may
be computed using the following "data reuse" approach. Model p
is fit to all the data less the first datum by LS and the residual
sum of squares is recorded (RSS_1). The procedure is repeated
after replacing the first datum and omitting the second data
point, and the resulting RSS_2 is added to the first. This is repeated
by replacing the ith data point and removing the (i+1)th datum,
and so forth, until F_p, the sum of the RSS_i over all data points,
has been accumulated. The entire procedure is
replicated for each proposed model. Then all F_p may be compared
and model p*, beyond which F_p does not get appreciably smaller, may
be chosen as an acceptable model. We have no experience with this
technique but it may be a useful data analysis termination strategy.
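
A data reuse score of this kind can be sketched as a leave-one-out loop.
The version below is an assumption about the intended computation: it
accumulates the squared error in predicting each omitted datum from a fit
to the remaining data, which is the form most often used in practice and
may differ in detail from the residual sums of squares described above.
The two candidate models (a constant and a straight line in
log-concentration), the function names, and the data are hypothetical.

```python
def fit_mean(xs, ys):
    """Constant model: predicts the mean of the retained observations."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Straight-line model fitted by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b0 = my - b1 * mx
    return lambda x: b0 + b1 * x

def loo_score(xs, ys, fitter):
    """Sum over i of the squared error in predicting point i from a fit
    to the remaining points."""
    total = 0.0
    for i in range(len(xs)):
        predict = fitter(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - predict(xs[i])) ** 2
    return total

# Hypothetical log-concentrations: roughly linear decay with small scatter.
ts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.82, 1.58, 1.41, 1.19, 1.02, 0.79]
scores = {"constant": loo_score(ts, ys, fit_mean),
          "line": loo_score(ts, ys, fit_line)}
```

Models would be compared by their scores, stopping at the point where
adding parameters no longer reduces the score appreciably. For nonlinear
poly-exponential models each of the N refits requires an iterative LS
solution, so the cost grows quickly with the number of data points.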


FINAL COMMENT. While following up on recommendations of the
panel, we ran across two treatises generally covering the areas of
goodness-of-fit and data analysis termination strategies which we
feel are important to pass on to the reader. They are Daniel and
Wood's book (see ref. 18) and a recent paper by Hocking.22 These
sources contain discussions of other techniques which may be
applicable to the problems addressed in this paper.

ABSTRACT. Unprocessed bran (bran) and carboxy-methyl cellulose
(CMC) were added to regular diets of overweight and normal weight volunteers
to determine the effect on serum lipids. Downward mean trends of cholesterol
and triglyceride levels were found in all groups taking bran and CMC after
twelve weeks except the overweight men ingesting CMC. Downward mean trends
for cholesterol ranged from 0.74 mg/100 ml to 1.66 mg/100 ml per week and for the
triglyceride from 0.36 mg/100 ml to 4.78 mg/100 ml per week.


1. INTRODUCTION. Coronary atherosclerosis is the leading cause of
death in the United States. In spite of this, atherosclerosis was rare
in this country before 1900 and remains almost unknown in some developing
countries.


Dietary factors are under constant scrutiny, and a number of researchers
have proposed that lack of dietary fiber may be an important causal factor,
because fiber is abundant in the diets of rural people in developing countries
where atherosclerosis is rare and has decreased in the diets of westerners
during the rise of fatal atherosclerosis.

Dietary fiber could lower serum lipids in various ways. It is hygroscopic
and might absorb emulsified lipids taken with the diet. Dietary fiber would
also absorb cholesterol secreted in the bile and thus reduce its reabsorption
in the small intestine. Increased dietary fiber also reduces
gastrointestinal transit time and this might also reduce absorption of lipids.

In  an  attempt  to  determine  whether  dietary  fiber  reduces  serum  lipids, 
we  performed  the  following  study. 

2. METHODS. Forty-four healthy men, ages ranging from 23 to 66 years,
volunteered for a 12-week study. All were on duty at the Armed Forces
Institute of Pathology when the study began. Most were pathologists, and
the remainder were trained in one of the medical specialties. All
understood the purpose of the study and were "dedicated" volunteers. They
continued their regular diets, did not alter their life styles, and
maintained body weight.

The men were divided by height/weight ratio and age into three
equivalent groups: control, bran and cellulose. Each member of the bran
group added 56 gm of unprocessed bran to his daily diet, 28 gm (1 ounce)
with breakfast and 28 gm with his evening meal, a daily supplement of
about 6 gm of nonnutritive fiber. Each member of the cellulose group
added 6 gm of cellulose* to his daily diet, 3 gm at breakfast and 3 gm
with the evening meal. This is all nondigestible, so both groups ingested
approximately 6 grams of nonnutritive hygroscopic substance. These
supplements were ingested for 12 weeks.

Fasting  blood  samples  were  collected  at  intervals  of  two  weeks;  serum 
cholesterol  determinations  were  done  every  two  weeks;  and  serum  triglyceride 
determinations,  every  four  weeks.  (The  control  group,  however,  had  no 
triglyceride  determinations  on  the  fourth  week.) 

During the course of the study, 9 of the 44 men dropped out: 4 were
transferred, 4 could not tolerate the unprocessed bran, and 1 man
substituted sweetened bran ("All Bran") for unprocessed bran. Of the
remaining 35, 18 had a "normal" weight and 17 were overweight. Linear regression
to estimate the trend of each man's serum lipids was calculated and the
trends were averaged for each group. Because only slopes were averaged, the
variation introduced by differences in lipid levels from subject to subject
was removed, a valid approach since each subject acted as his own control
in the trend analysis. A refinement of the analysis involved the
recomputation of the average trends per group with each subject's degree of
consistency of trend used as a weight in obtaining a weighted-average trend
(where degree of consistency was measured as the reciprocal of the variance
of the slope). The weighted average, while conferring greater importance
to consistent trends, also served to be selective, giving some subjects
considerable prominence. Therefore, special care was taken in the
interpretation of the weighted averages to ensure that they were also
representative of the group.

The  probabilities  were  obtained  from  Student's  t-test  on  the  average 
trends  (weighted  and  unweighted)  for  each  group  under  the  null  hypothesis  of 
zero  trend  against  the  one-sided,  alternative  hypothesis  of  negative  slope. 
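
The trend analysis described in this section can be sketched as follows.
The lipid readings are invented for illustration, the function names are
hypothetical, and only the unweighted t statistic is computed; how the
authors formed the test for the weighted average is not stated, so that
step is left out. Each subject's slope and the variance of that slope come
from an ordinary straight-line regression on week number.

```python
import math

def slope_and_variance(weeks, levels):
    """Least-squares slope of level on week, and the variance of that slope."""
    n = len(weeks)
    mx, my = sum(weeks) / n, sum(levels) / n
    sxx = sum((x - mx) ** 2 for x in weeks)
    slope = sum((x - mx) * (y - my) for x, y in zip(weeks, levels)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(weeks, levels))
    return slope, rss / (n - 2) / sxx

def mean_trend(slopes, variances=None):
    """Unweighted mean slope, or mean weighted by 1/variance when given."""
    ws = [1.0] * len(slopes) if variances is None else [1.0 / v for v in variances]
    return sum(w * s for w, s in zip(ws, slopes)) / sum(ws)

def t_statistic(slopes):
    """One-sample t for H0: mean slope = 0 (refer to Student's t, n-1 df)."""
    n = len(slopes)
    m = sum(slopes) / n
    sd = math.sqrt(sum((s - m) ** 2 for s in slopes) / (n - 1))
    return m / (sd / math.sqrt(n))

# Hypothetical cholesterol readings (mg/100 ml) at two-week intervals.
weeks = [0, 2, 4, 6, 8, 10, 12]
subjects = [
    [220, 219, 215, 214, 211, 208, 207],
    [195, 197, 190, 192, 188, 189, 183],
    [240, 236, 238, 231, 229, 230, 224],
]
fits = [slope_and_variance(weeks, s) for s in subjects]
slopes = [f[0] for f in fits]
variances = [f[1] for f in fits]
unweighted = mean_trend(slopes)
weighted = mean_trend(slopes, variances)
t = t_statistic(slopes)
```

A large negative t supports the one-sided alternative of a downward
trend. Because the weights are reciprocals of slope variances, a subject
with an unusually tight fit can dominate the weighted mean, which is the
selectivity the authors caution about.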

3. RESULTS. The triglyceride levels were sharply lowered in the
normal-weight subjects eating bran and cellulose. The group of overweight
subjects eating bran and cellulose and the control group did not show this
strong trend. See Fig. 1. In addition, mean cholesterol levels fell in
the group of overweight men taking bran. The graphs in Fig. 1 are means of
the individual trends so that the variation in lipid levels from subject to
subject was removed.


*Purchased as sodium carboxy-methyl-cellulose tablets, 0.5 gm, from
Interstate Drug Exchange Mfg. Co., Plainview, Long Island, New York 11803

Using a preliminary cutoff at P < .10, four of the seven negative
slopes in the bran and cellulose groups were statistically significant.

See  Table  1.  Expressed  as  a percentage  of  the  Initial  levels,  the 
reduction  was  11%  for  the  group  taklno  CMC  and  SOX  for  the  group  taking 
bran.  Three  of  the  mean  trends  that  failed  the  statistical  cutoff  were 
groups  of  overweight  men.  Because  of  the  greater  variability  of  serum 
lipid  trends  among  the  overweight  men,  a refined  analysis  was  performed 
consisting  of  computing  weighted-average  trends  using  as  weights  the 
degree  of  consistency  of  each  Individual's  trend.  The  weighted  trends 
generally  show  a numerically  steeper  rate  of  reduction  of  serum  lipids 
together  with  enhanced  statistical  probabilities.  Thus,  six  of  the  eight 
trends  for  the  bran  and  cellulose  groups  were  statistically  significant 
(P  < .05)  downward  trends.  The  only  non-downwsrd  trend  was  the  overweight 
men'll nges ting  CMC  whose  serum  cholesterol  unaccountably  Increased.  This 
contrasts  with  the  decreased  triglyceride  level  for  this  same  group. 
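The weighted-average trend described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text says only that the weights reflect "the degree of consistency of each individual's trend", so the sketch assumes inverse-variance weights taken from each subject's least-squares fit. The function names and the sample numbers are hypothetical.

```python
import math


def ols_slope(weeks, levels):
    """Least-squares slope of one subject's lipid level against time,
    together with the estimated variance of that slope."""
    n = len(weeks)
    if n < 3:
        raise ValueError("need at least three observations per subject")
    mean_x = sum(weeks) / n
    mean_y = sum(levels) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, levels))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(weeks, levels))
    return slope, rss / (n - 2) / sxx


def weighted_mean_trend(slopes, variances):
    """Weighted mean of the individual slopes and its standard error.
    ASSUMPTION: weight = 1 / variance of each subject's slope."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * b for w, b in zip(weights, slopes)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical subject: levels in mg per 100 ml at weeks 0, 2, 4 and 6.
slope, var = ols_slope([0, 2, 4, 6], [100, 96, 93, 88])
# Hypothetical group of two subjects with different consistency.
group_mean, group_se = weighted_mean_trend([-2.0, -4.0], [1.0, 4.0])
```

Under this assumption a subject whose points lie close to a straight line counts for more than one whose levels fluctuate, which is one way a weighted trend could come out steeper and less variable than the plain average. The ratio `group_mean / group_se` would then be compared against a one-sided critical value for the hypothesis of negative slope.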

Since each subject served as his own control (his pretreatment level
was the initial point for his own trend), no reference thus far has been
made to the actual control groups. They served to determine whether an
unknown or subconscious factor influenced serum lipids during the study.
The average trends for the control group revealed no such factor. See
Table 1. One of the weighted-average trends (the triglyceride levels in
the normal-weight control group) did fall with P = .12. To be conservative,
therefore, this slope was subtracted from the slopes of the bran and
cellulose groups for the normal-weight men in computing the probability
statements.

Laboratory variation, expressed as a ratio of the laboratory variance
to the residual experimental variance, was 1/16, a negligible quantity as
a possible factor affecting the analysis and interpretation of these data.
The standard deviation for the laboratory, calculated over each two-month
period, was found to be 5 mg/100 ml for serum cholesterol and 8 mg/100 ml
for serum triglyceride.

4. COMMENT. A number of studies reporting the effects of whole or
fractional grain products on serum lipids have produced varied results,
but the majority support the view that whole grain and whole grain products
tend to lower serum lipids.1*~22 In our study CMC lowered the average
triglyceride levels by 75% in normal-weight subjects, and bran lowered the
average serum triglyceride levels of normal-weight subjects by 80%. We do
not know the mechanisms by which bran and CMC lowered serum lipids. Some
possible mechanisms suggest that nonnutritive substance (1) increases the
excretion of bile acids by increasing catabolism of cholesterol in the
liver,1» (2) shortens gastrointestinal transit time, thus allowing less
time for lipids to be absorbed, and (3) absorbs water, bile salts and other
solutes including lipids, thus reducing absorption of lipids. None of these
hypotheses, however, explains the fact that serum triglycerides in our
normal-weight men dropped more quickly than serum lipids in our overweight
men ingesting bran and CMC. If nonnutritive substance lowers serum
triglycerides more quickly in non-obese men, then other dietary factors
probably play a role. One of these could be the ingestion of excessive
amounts of refined carbohydrates by the overweight men. Sugar, for example,
not only contributes to obesity but is an important cause of
hyperlipidemia.^'^

Our study supports the opinion that nonnutritive substances (bran
and CMC) lower serum lipids. In particular, we found that the most
striking lowering effect was on the serum triglycerides in men taking
CMC who were not overweight.
FIGURE 1.

SERUM CHOLESTEROL AND TRIGLYCERIDE LEVELS FOR THREE GROUPS OF
VOLUNTEERS--THE CONTROL GROUP, THE GROUP INGESTING BRAN
AND THE GROUP INGESTING CELLULOSE.


Table 1.  Mean Trends (b, mg % per week) of Serum Triglyceride and
          Serum Cholesterol Levels as Determined on 35 Volunteers
          for 12 Weeks.

                     No.         Averages           Weighted Averages
                     of
                     Vol.     b     s(b)   Prob.     bw    s(bw)   Prob.

Serum Triglyceride
  Controls
    Normal Weight     9     -0.43   0.93    NS     -0.53   0.30    0.12
    Overweight        5     -0.33   1.70    NS      0.80   1.36    NS
  Bran
    Normal Weight     4     -2.88   1.77    0.08   -3.48   1.69    0.05
    Overweight        6     -0.60   1.86    NS     -2.06   0.71    0.01
  Cellulose
    Normal Weight     5     -4.78   1.95    0.02   -4.12   0.90    0.001
    Overweight        6     -0.36   1.72    NS     -0.85   0.38    0.02

Serum Cholesterol
  Controls
    Normal Weight     9      0.22   0.70    NS      0.09   0.56    NS
    Overweight        5      0.37   0.78    NS     -0.78   0.62    NS
  Bran
    Normal Weight     4     -1.49   0.96    0.07   -3.42   0.59    0.001
    Overweight        6     -1.65   1.11    0.08   -2.85   0.68    0.001
  Cellulose
    Normal Weight     5     -0.74   0.85    NS     -0.98   0.75    0.10
    Overweight        6      1.04   0.75    NS      1.52   0.80    NS
ANALYSIS OF AN ERROR-TIME RESPONSE PERFORMANCE

Michael Hacikaylo
U.S. Army Electronics Command
Night Vision Laboratory
Fort Belvoir, Virginia

ABSTRACT. The analyses of the error-time response performances of
groups of naive subjects permitted to make discrete right/wrong decisions
are presented for three experimental display panels of increasing
complexity. The panel designs were based on a circular representation of
light bulbs, where the lights corresponded to the angles of a circle. The
first panel design consisted of a ring of lights that portrayed one
contiguous angular representation by the lights. A contiguous representation
of light was defined as a domain. The complexities of the second and
third designs were increased to two contiguous semicircular representations
of the ring of lights, where for each design the semicircular
representations were defined as two domains. The function of the panels was
to display the azimuthal angular source location of infrared lasers when
detected by infrared detection systems.

The subjects were randomly selected from a large population having
no prior knowledge (zero degree of learning) of the panels and separated
into three groups of seven subjects each. Each subject evaluated two
of the three panels in an ABBA manner for one and only one set of six
trials per panel. Such a group of subjects, constrained to the same
degree of learning of the panels and limited to the one set of trials, is
defined as an eigengroup for this analysis.

The subjects were instructed to mark on a response panel, as accurately
and rapidly as possible, the corresponding angular light of the stimulus
panel, i.e., the display panel. The response panel was a five inch circle
drawn on an 8 inch by 10 inch paper.

The number of errors of the eigengroups was analyzed as a function
of time for each of the experimental designs. It was found that for the
experiment, the error-time response equation is log E = -2n log T + K,
where E is the number of total errors per eigengroup, n is the number of
domains of the stimulus panel, T is the mean time for the total number of
trials for each eigengroup per system, and K is a constant. It was
necessary to introduce new terms, i.e., domain and eigengroup, to
unambiguously define the stimulus panel and interpret the results
consistent with the equation.

1. INTRODUCTION. The purpose of this paper is to present an
error-time analysis of the designs of the informational display panels of
infrared detection systems. The systems detected and displayed the
azimuthal angular position of a laser source to a crew during a laser-tank
engagement as shown in figure 1. Since the error-time response
performance of a well-trained crew would more aptly reflect the selectivity
and training of the crew, an evaluation procedure was required that would
reflect the panel designs rather than the personnel capability, training
and cumulative learning process. To implement the procedure it was
decided to employ naive subjects who had no knowledge of the systems,
exposed only to instructional procedures (without preliminary learning
trials) and constrained to make one and only one decision per trial. The
decision would be considered right or wrong. The results should be
different from the cumulative learning performance where error decisions
were allowed until the correct decision was made (Gagné and Foster, 1949).
In other procedures, errors were treated as partially correct answers
(Fitts and Seeger, 1953), and the error-time response data are
statistically treated to determine the mean and standard deviation of the
error-time parameters. These parameters are interpreted as how far from the
correct value the errors are as a function of learning and response times.
The determination of the number of discrete errors as a function of time
for a group of subjects, who were not trained nor subjected to the
cumulative learning process, is not apparent in literature.

In this paper, the error-time response performance of groups of
individuals subjected to only one set of trials resulted in a
frequency-distribution curve which was different from a cumulative
performance curve. The mathematical analysis of error-time response data of
random groups of individuals subjected to the one-trial set method appears
to be significantly new. To assure that the groups were not subjected to a
cumulative learning process, each subject was instructed as to the
procedure and then dismissed after evaluating the panels. In this manner,
each group was considered to be of the same or identical state of
conditioning or training for all sets of trials. Such groups are defined as
eigengroups. (The word eigen means proper, inherent, peculiar.) In a fuller
context, the error-time response performance of eigengroups is properly
satisfied only when the groups are subjected to the one-set trial method.

2. METHOD. The error-time response performance data were obtained
on panel designs similar to those of Fitts and Seeger (1953). Due to the
similarity, the Fitts and Seeger experiment is briefly described. Their
experiment, in essence, was to determine the learning skills of matched
groups of individuals to a singlefold response. The stimulus panel had
a ring of eight equally spaced light bulbs. The stimulus was a light
flashing on. This action keyed the subject to associate the light with
the angular position on the ring. The response panel had a stylus. The
response was the action by the subject in moving the stylus to the
corresponding position on the response panel as the interpreted position
of the stimulus panel. (Two variations of the stimulus panel were
geometrically configured with increasing complexity to simulate the ring
design. The corresponding response panels were also increased in
complexity. The S-R compatibility of those designs was also determined.)

The panel designs reported here were also based on a circular
representation of equally spaced light bulbs. Since the physical entity is
the light bulb embodying the stimulus, the physical entity (light bulb)
is defined as the stimulant. The stimulant and the configurational
display of the stimulant (ring of light bulbs) on the stimulus panel
together are defined in this paper as the significand. The significands
were geometrically configured to increase the complexity of the stimulus
panel for the singlefold response. The three designs are now described.


Panel A. The significand of the panel was a three and one-half inch
diameter ring of 36 equally spaced light bulbs as portrayed in figure 2(a).
The ring was positioned on the front surface of a box 4 inches wide,
6 inches long and 2 inches deep. The light bulbs were angularly marked
in degrees from zero to 360 degrees in ten degree increments in a
clockwise direction with zero at the top. The continuous clockwise direction
of the marked light bulbs is considered as a domain of the significand,
i.e., one contiguous representation of the stimulus panel as portrayed in
figure 2(b). When a light came on, it signified the angular position on
the ring.

Panel B. The significand of the panel was a three and one-half inch
diameter ring of 36 equally spaced light bulbs marked in angular mils as
portrayed in figure 2(c). The ring was positioned on the front surface
of a box identical in dimensions to that of panel A. The light bulbs were
angularly marked in mils from zero to 3200 in 177.78 mil increments in a
counterclockwise direction with zero at the top for one-half of the circle.
(There are 6400 mils per 360 degrees of a circle; therefore, each position
corresponds to 177.78 mils as well as 10 degrees.) The angular marking
started at zero again at the bottom, and continued in the counterclockwise
direction to 3200 at the top. The two halves completed the circle. The
two counterclockwise directional iterations of the marked light bulbs are
considered as two domains of the significand, i.e., two contiguous
representations of the panel as shown in figure 2(d). When a light came on,
it signified the angular position on the ring.

Panel C. The significand of the panel was a stimulant in the form
of an alphanumeric readout display as portrayed in figure 2(e). The
display window was positioned on the front surface of a box of identical
dimensions to that of panel A. The first of three characters was a letter,
L or R, and the next two were digits ranging from 00 to 32. The letter R
indicated a circular representation in a clockwise direction. The
numerical values indicated the angular position in 100 mil increments
(equivalent to 5.625 degrees) with zero at the top and increasing to 3200
mils for one-half of the circular representation. The letter L indicated a
circular representation in a counterclockwise direction. The numerical
values indicated the angular position in 100 mil increments with zero at
the top and increasing to 3200 mils for the completion of the circular
representation. The one clockwise and one counterclockwise directional
representations of the circle are considered as two domains of the
significand, i.e., two semicircular representations of the stimulus panel
as shown in figure 2(f). When an alphanumeric readout came on, it signified
the angular position on the circular representation.
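The mil and degree figures quoted for panels B and C follow from the stated 6400 mils per 360 degrees. The short sketch below checks that arithmetic and decodes a panel C readout into a clockwise angle in degrees. The function names are hypothetical, and mapping an "L" readout to 360 degrees minus the angle is an interpretation of the panel description, not something the paper states as a formula.

```python
MILS_PER_CIRCLE = 6400.0
DEGREES_PER_CIRCLE = 360.0


def mils_to_degrees(mils):
    """Convert angular mils to degrees (6400 mils = 360 degrees)."""
    return mils * DEGREES_PER_CIRCLE / MILS_PER_CIRCLE


def degrees_to_mils(degrees):
    """Convert degrees to angular mils."""
    return degrees * MILS_PER_CIRCLE / DEGREES_PER_CIRCLE


def panel_c_to_degrees(readout):
    """Decode a panel C readout such as 'R08' or 'L08'.

    The letter gives the direction from zero at the top and the two
    digits give the angle in units of 100 mils (00 to 32).
    ASSUMPTION: the result is expressed clockwise from the top, so an
    'L' readout maps to 360 degrees minus the decoded angle.
    """
    direction, digits = readout[0].upper(), readout[1:]
    hundreds = int(digits)
    if direction not in ("L", "R") or not 0 <= hundreds <= 32:
        raise ValueError("readout must be L or R followed by 00 to 32")
    angle = mils_to_degrees(hundreds * 100)
    if direction == "R":
        return angle
    return (DEGREES_PER_CIRCLE - angle) % DEGREES_PER_CIRCLE


step_b = degrees_to_mils(10)    # spacing of the 36 bulbs, in mils
step_c = mils_to_degrees(100)   # one panel C increment, in degrees
```

Expressing every readout on a single clockwise scale is convenient because the response panel described next is marked clockwise in both degrees and mils.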

Response Panel. The response panel was identical for each panel.
A five-inch circle was drawn on an 8 x 10 inch sheet of plain paper. The
circle was divided into quadrants and marked in degrees and mils as
follows: Zero degrees (0°) and zero mils (0 mils) were marked at the top.
In a clockwise direction, each quadrant was successively marked 90°,
1600 mils; 180°, 3200 mils; 270°, 4800 mils; and again at the top, 360°,
6400 mils. A pencil was used for marking angular positions with an "X"
on the circle.

3. PROCEDURE. Twenty-one U.S. Army enlisted men of all ranks, who
were not formally matched but had no prior knowledge of the experimental
panels, were randomly selected and separated into three groups of seven
subjects each. One at a time, each subject was thoroughly briefed on the
operational procedures of two preselected display panels just prior to
evaluation. The subject was instructed as follows: As quickly and as
accurately as possible, read the angular representation of a light (or
digital readout) and the appropriate direction when the stimulus light
came on, and mark with an "X" that angular position on the circle of the
sheet of paper. The position of the intersection of the "X" was
considered to be the angular position. For familiarization the subject was
given two preliminary runs if so desired.

Each subject performed a series of six trials on the two preselected
cases in an ABBA manner for a total number of 12 trials. On a
preprogrammed schedule of randomness, each subject read the angular position
and marked the circle as quickly and accurately as possible. There were
three "X"s per response (paper) panel since one panel was supplied for
each A, B, B, A sequence. The time interval from when the light came on
to when the subject marked the panel was measured to 0.001 second;
however, the time for each trial was recorded to the nearest 0.01 second.
It is assumed that the reaction-time error introduced by the investigative
team for the time measurements was constant for the trials.

Upon completion of the set of trials, each subject was dismissed.
Care was taken to ensure that subsequent subjects for evaluation did not
associate with any of the previously dismissed subjects.

The display panels were evaluated in a room that contained a 36
inch high bench, chair, and associated equipment required to activate
the lights of the panels. The panels, two at a time, were positioned one
on top of the other on the bench in the following sequence: for
eigengroup 1, panel A on panel C; for eigengroup 2, panel B on panel C; and
for eigengroup 3, panel B on panel A. It is to be noted that the sequence
for eigengroup 2 was incorrect for maintaining proper counterbalancing
block order. However, this flaw did not appear to be evidenced in the
analyses as described later. The only persons permitted in the room were
the subject and the investigative personnel.


For each stimulus panel angle (light), three angular resolution ranges
for determining the accuracy of the response panel angle "X" were
considered: a) ±40 degrees, b) ±20 degrees, and c) ±10 degrees. The
readout angle was considered an error if the response angle differed from
the stimulus angle by more than the angular resolution, i.e., each response
angle would be a right/wrong decision for each of the three ranges.
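The right/wrong scoring rule for the three resolution ranges can be sketched as below. The function names are hypothetical, and treating the angular difference as the shorter way around the circle (so that 350 and 10 degrees are 20 degrees apart) is an assumption; the paper does not say how differences across the zero mark were handled.

```python
def angular_difference(stimulus_deg, response_deg):
    """Smallest absolute difference between two angles, in degrees.
    ASSUMPTION: differences wrap around the zero mark at the top."""
    diff = abs(stimulus_deg - response_deg) % 360.0
    return min(diff, 360.0 - diff)


def score_response(stimulus_deg, response_deg, tolerances=(40.0, 20.0, 10.0)):
    """Return, for each resolution range, True if the response is an error.
    A response is an error when it misses the stimulus angle by more than
    the tolerance of that range."""
    diff = angular_difference(stimulus_deg, response_deg)
    return {tolerance: diff > tolerance for tolerance in tolerances}


# Hypothetical trial: stimulus at 90 degrees, "X" marked at 115 degrees.
result = score_response(90.0, 115.0)
```

Because the three ranges are nested, a single response can be correct at ±40 degrees and still count as an error at ±20 and ±10 degrees, which is why the error counts in the results never decrease as the range narrows.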

The angular positions marked on the response panels (from the
preprogrammed readout angles) were measured in degrees. This was done by
using a transparent template graduated to 0.3 degree which was superimposed
on the marked response panels. The accuracy of the marked angle was
measured to ±0.5 degrees.

The average times of the six trials for the panels that each subject
evaluated were determined. The number of errors and associated time for
each of the 252 trials were tabulated for data reduction.

4. RESULTS. The mean time and number of angular errors in each
range for the set of six trials for the subjects are shown in Table 1.
The table separates the subjects into their respective eigengroups for the
panels evaluated. The average of the mean times as well as the total
errors per range for the groups for each panel are also shown in Table 1.
Note that the errors are considered as completed events and that the
standard deviation of the angular errors has no significance in this
analysis.

An accepted method for the portrayal of the frequency-distribution
data of Table 1 is to plot the number of errors (per subject) as a
function of the mean time (per subject). To illustrate the method, plots
of the number of errors in the range ±40° as a function of the mean time
of the subjects of the groups for each of the panels are shown in figure
3. The data presented in such a fashion cannot be clearly interpreted.
The only two significant observations that can be made are as follows:
The first is that the error-mean time response performance curves of the
two eigengroups for the same design exhibit some degree of similarity,
and the second is that most of the errors occur in the 3 to 5 second
time interval.

However, if the data are plotted in a different fashion, a strikingly
new set of parametric curves is generated. If, for the data of Table 1,
the number of total errors, E, in the range ±40° per eigengroup is plotted
as a function of the mean time on a log E vs log T scale, it can be seen
that two distinct linear curves are generated as shown in figure 4. The
eigengroup datum points for panel A fall on one line, and eigengroup datum
points of panel B and panel C fall on a second line. The two curves are
separated by at least one order of magnitude in the error count, and this
separation indicates that there is a uniqueness between panel A and
panels B and C.

The curve for panel A can be expressed as

     log E_{A,40} = -2 log T_A + 1.16                        (1)

where E_{A,40} is the total number of errors ±40° per eigengroup for panel
A; T_A is the mean time per eigengroup of panel A; and 1.16 is a constant.
The negative sign is interpreted to mean that as the amount of time is
increased for reading the stimulus panel angle and marking the response
panel, the number of errors decreases.
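Each panel was evaluated by two eigengroups, so a line on the log E vs log T plot is fixed by two datum points. The sketch below shows how a slope and constant of the form used in Eq. (1) could be read off two such points. The function name is hypothetical, and the two points are illustrative values placed exactly on Eq. (1); they are not the paper's measurements.

```python
import math


def loglog_line(t1, e1, t2, e2):
    """Slope and constant K of the straight line log E = slope * log T + K
    through two (mean time, total errors) points, using base-10 logs."""
    if min(t1, e1, t2, e2) <= 0 or t1 == t2:
        raise ValueError("times and error counts must be positive, "
                         "and the two times must differ")
    slope = ((math.log10(e2) - math.log10(e1))
             / (math.log10(t2) - math.log10(t1)))
    k = math.log10(e1) - slope * math.log10(t1)
    return slope, k


# Illustrative points generated from Eq. (1): log E = -2 log T + 1.16.
times = (2.0, 4.0)
errors = tuple(10 ** 1.16 / t ** 2 for t in times)
slope, k = loglog_line(times[0], errors[0], times[1], errors[1])
```

Note that an eigengroup with zero total errors cannot be placed on a logarithmic scale at all, so every plotted point needs at least one error.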


HUMAN FACTORS DATA, M-252

Eigengroup 1

                    Panel A                          Panel C
Subject   Mean Time    Angular Errors      Mean Time    Angular Errors
            (Sec)         (Number)           (Sec)         (Number)
                       ±40°  ±20°  ±10°               ±40°  ±20°  ±10°

   1        5.85         0     0     0       5.86       1     1     1

2 

2.83 
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2 
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3.45 
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0 
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5.84 
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1 
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2.96 
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1 

1 
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2.95 

1 

1 

2 

3.52 

4 

5 
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6 

4.68 

0 

0 

2 

3.98 

0 

0 

1 

7 

3.17 

3r87(Ava) 
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■M 
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5 

ll 

4.86 

4.33(Ave) 

1 

8 

A 

10 

ll 

Eigengroup 2

                    Panel B                          Panel C
Subject   Mean Time    Angular Errors      Mean Time    Angular Errors
            (Sec)         (Number)           (Sec)         (Number)
                       ±40°  ±20°  ±10°               ±40°  ±20°  ±10°

   8        3.78         0     0     1       3.51       1     1     1
   9        4.38         1     1     3       3.37       0     0     0
  10        4.11         0     0     1       3.66       0     0     0
  11        5.06         0     0     1       3.87       0     1     3
  12        4.59         0     1     2       4.23       0     1     1
  13       11.51         0     1     3      10.08       0     0     0

14 

5.24 

5t52(Ave) 

i 

1 

5 

4 

13 

kdl 

7793 (Ave) 

2 

3 

1 

6 

1 

10 

Eigengroup 3

                    Panel B                          Panel A
Subject   Mean Time    Angular Errors      Mean Time    Angular Errors
            (Sec)         (Number)           (Sec)         (Number)
                       ±40°  ±20°  ±10°               ±40°  ±20°  ±10°

  15        2.95         3     5     5       2.90       0     1     4
  16        3.31         4     6     6       2.75       0     1     3
  17        3.85         3     4     5       3.49       1     1     4
  18        1.87         4     6     6       0.98       1     2     6
  19        2.79         5     5     6       2.55       0     2     4
  20        3.24         1     1     4       3.10       0     2     4

21 

4.l16. 

3.26(Ava) 

£ 

27 

si 

.^82 

ftfl(Ava) 

£ 

2 

£ 

9 

i 

29 

Table 1. Mean time and number of errors for each subject per eigengroup
tabulated for each angular resolution range per panel.

[Figure 3: horizontal axis TIME PER SUBJECT (SEC)]

[Figure 4: curve label LOG E = -4 LOG T + 3.38; A, B, C denote panel;
1, 2, 3 denote eigengroup; horizontal axis MEAN TIME PER EIGENGROUP,
T (SEC)]

Figure 4. Error dependence on the mean time for panels evaluated
per eigengroup, ±40 degree resolution range
The curve for panel B and panel C can be expressed as

     log E_{B,C,40} = -4 log T_{B,C} + 3.38                  (2)

where E_{B,C,40} is the total number of errors ±40° per eigengroup for
panels B and C; T_{B,C} is the mean time per eigengroup for panels B and C;
and 3.38 is a constant.

The similar plots of the number of total errors ±20° per eigengroup
are shown in figure 5. The curve for panel A can be expressed as

     log E_{A,20} = -2 log T_A + 1.86                        (3)

where E_{A,20} is the total number of errors ±20° per eigengroup for panel
A, T_A is the mean time and 1.86 is a constant. The curve for panels B and
C can be expressed as

     log E_{B,C,20} = -4 log T_{B,C} + 3.57                  (4)

where the terms have the same comparable definitions as for Eq. (2).

The plots of the number of total errors ±10° per eigengroup are
shown in figure 6. The curve for panel A can be expressed as

     log E_{A,10} = -2 log T_A + 2.35,                       (5)

and the curve for panels B and C can be expressed as

     log E_{B,C,10} = -4 log T_{B,C} + 3.72                  (6)

where the terms are defined similarly as those in Eqs. (1) and (2).

The general equation can be expressed as

     log E = -2n log T + K                                   (7)

where E is the number of total errors per eigengroup, n is the number of
domains of the significand of the stimulus panel, T is the mean time for
the total number of trials for each eigengroup per system, and K is a
constant. The general equation and the definitions of the terms are
limited to the results and discussions of the above analyses of the
error-time response performance for a singlefold response.
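The general equation is a power law in disguise: log E = -2n log T + K is the same statement as E = 10^K / T^(2n). The sketch below evaluates it for the constants quoted in the paper. The function and table names are hypothetical; the one-domain ±40 degree constant is taken as the 1.16 of Eq. (1), although the Discussion quotes 1.20 for the same curve, and the two-domain constants 3.57 and 3.72 are taken from the Discussion.

```python
import math

# Displacement constants K, keyed by number of domains and then by the
# resolution range in degrees, as quoted in the paper.
K_BY_DOMAINS = {
    1: {40: 1.16, 20: 1.86, 10: 2.35},   # panel A
    2: {40: 3.38, 20: 3.57, 10: 3.72},   # panels B and C
}


def predicted_errors(mean_time, n_domains, k):
    """Total errors per eigengroup from log E = -2n log T + K (base 10)."""
    if mean_time <= 0:
        raise ValueError("mean time must be positive")
    return 10.0 ** (k - 2 * n_domains * math.log10(mean_time))


def error_table(mean_time):
    """Predicted totals for every panel type and resolution range."""
    return {
        (n, tol): predicted_errors(mean_time, n, k)
        for n, ranges in K_BY_DOMAINS.items()
        for tol, k in ranges.items()
    }


table = error_table(4.0)   # hypothetical mean time of 4 seconds
```

Doubling the mean time divides the predicted errors by 4 for a one-domain panel and by 16 for a two-domain panel, which is the practical content of the slopes -2 and -4.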

5. DISCUSSION. The purpose of this experiment was to evaluate the
human factors of three variations of a display panel by subjects with zero
(minimal) bias. The mathematical analysis of the error-time response
performances of groups of "unbiased" individuals resulted in a new
frequency-distribution equation. In order to maintain clarity in
describing the experimental designs and procedures, it was necessary to
introduce and define new parameters which would be relevant to the analyses
of the panels and interpretation of the equations. The three display panels
were designed to perform the same response function, but the complexity and
domains of the significands of the stimulus panels were increased. In
particular, the methods of the angular readouts were changed from the
circular display of degrees in one direction of one domain of panel A to
the two circular sequential displays of mils in the same direction of two
domains of panel B, and finally to the alphanumeric readout of mils
generating two semicircular displays in opposite directions of two domains
of panel C.

The subjects, selected at random for this experiment, were considered
to be identical, but not matched, with respect to knowledge and training
associated with the designs. (Random groups are comprised of subjects
which would be considered representative of a large assembly of those
subjects, whereas matched groups are defined as groups comprised of those
subjects whose evaluated characteristics have been found to be similar
within some norm of a criterion. Both groups can be considered as
eigengroups if they are constrained to being nearly the same state of
knowledge and training, and evaluated once and only once for one set of
trials for each of the experimental designs.)

The curves for the three error ranges for the panel having one domain
(panel A) can be portrayed by an empirical equation, log E = f(log T),
with each having the same slope of -2. The displacement constant increases
from 1.20 to 1.86 to 2.35 with increasing angular readout resolution. The
curves for the three ranges for the panels having two domains (panels B
and C) can be portrayed by the same empirical equation as above with each
having a slope of -4. The displacement constant increases from 3.38 to
3.57 to 3.72 with increasing angular readout resolution. Since the slope
of the general equation is -2n, where n is the number of domains, and the
constants increase with increasing angular resolution as well as number
of domains, it appears that the general equation is an explicit function
of both the number of domains of the panels and the resolution of the
response data. This implies that the general equation is independent of
the amount of training of the eigengroups. However, it is logical to
expect that for a given number of trial sets, the total number of errors
per eigengroup would decrease with increased level of training. Since
it is not known how the training would affect the equation, if at all, it
is assumed that the general equation is an implicit function of training.
In order for the error-time response experiment to be meaningful, the
number of trial sets must be sufficiently large so that at least one
error is committed per trial set for each of the experimental designs.



The increased complexity, i.e., changing the significand from a
circular representation of two domains of panel B to an alphanumeric
representation of two domains of panel C, had no apparent influence on, or
deviation from, the linearity of the curves representing those panels.
The lack of deviation is not unexpected as a result of the Gagné and
Foster (1949) studies.

It is realized that the analyses presented here are of a small sample
evaluation of groups of individual subjects. However, the analysis of
variance indicates that trials of the right/wrong decision-timed response
performances on the three systems are valid (F = 5.08; p<0.001).
The analysis of variance for the systems (F(2,156); p<0.001)
indicated that the systems were different, and that Trial X System
(F(10,156)) was not significant. Some learning did occur for the
subjects (six trials each); however, the learning did not interact with
the systems, and all subjects learned to about the same degree.

From the above discussion, it is postulated that the general equation
is valid for other error-time response experiments similar to those
described in this paper. Efforts were made to apply the error-time
response data of Gagné and Foster (1949) and Fitts and Seeger (1953) to
the analysis. This was done for the purpose of subjecting Eq. (7) to
experimental results of other investigators for corroboration. The
error-time response equation could not be generated from the above sources
for the following reasons: (1) the mean time was measured only for the
correct choice, which included the wrong choices until the correct choice
was made; (2) the total number of errors was determined as a function of
preliminary and accumulated training; and (3) most importantly, the
error-time measurements were not made on eigengroups, i.e., those groups
having identical prior knowledge of the panels and the same acquired
learning for each set of trials for the entire series of trial sets.
Further investigational work is required to subject the general equation to
experimental verification.

6.  SUMMARY.  The experiment presented here is similar to those
reported in literature, and the stimulus-response procedures are standard
practices. It is known, in general, that as a subject takes less time to
make decisions, considered to be right/wrong, the number of errors
increases and the standard deviation becomes larger. However, the
experiment here differs on two important aspects with respect to the
control of the subjects and data analyses. The first is that the subjects
were separated into groups of equally biased knowledge (no pretraining)
concerning the panels and were not subjected to a cumulative learning
process for the entire series of trial sets. The second aspect is that
an error was considered as a discrete response of a right/wrong decision
and the errors were analyzed as a function of the mean time of the total
number of decisions per eigengroup. The analyses of the error-time
equations necessitated the introduction of new parameters in order to
unambiguously define the stimulus panels and interpret the procedures and
results consistent with the equations. The general equation is a
mathematical expression which, for this experiment, describes the relationship
between the number of errors of right/wrong decisions and the mean time
in making the decisions.


RELIABILITY ANALYSIS
OF AIRFIELD LIGHTING SYSTEMS


Frank Kuo
Edward S. Lindow
Construction Engineering Research Laboratory
Champaign, Illinois


ABSTRACT.  The reliability analysis of a system with multiple types
of components under maintenance is a complex problem. This paper presents
procedures for such analysis with specific application to airport lighting
systems. A set of consecutive coefficients is introduced to account for
system failure criteria which include random light outages, consecutive
light outages, and consecutive light bar failures. Probability theory
and simulation techniques are used along with the consecutive coefficients
in determining system reliability. The computerized model has been used
to evaluate the effects of parameters such as unit reliability, system
configuration, maintenance strategy, and unit performance characteristics.


1.  INTRODUCTION.  Visual guidance lighting systems for airports provide
necessary information for aircraft operation during the approach, landing,
takeoff, and ground movement (taxiing). In darkness, inclement weather or
other periods of low visibility, the information provided by these systems
is critical to safe and efficient air travel.


Although significant research has been devoted to improving component
equipment in these lighting systems and to delineating the pilot's
information requirements, little has been done to determine the operational
reliability of the systems currently in use. Because these systems are
critical to safe and efficient aircraft operations and because installation
and maintenance costs for such systems are high, procedures to analyze the
reliability of present airfield lighting systems are needed.

The purpose of the research summarized in this paper was to develop
procedures for evaluating the functional reliability of airfield lighting
systems.

2.  AIRFIELD LIGHTING SYSTEM MODEL.  There are numerous types of
lighting systems involved in the visual guidance of aircraft traffic. The
number and the configuration of lights in each system will depend on factors
such as the information conveyance requirements, the area to be served, the
category of operations, and the terrain.


Although individual systems are comprised of specialized equipment in
configurations designed to satisfy specific information requirements, all
lighting systems have the common elements of a power source, power circuitry
and light transmission equipment. Because of these similarities, a general
model can be used to define all visual guidance lighting systems.

The model developed for this purpose consists of 12 types of components:
commercial power, auxiliary power, control panel, control circuitry, control
vault, regulator, primary cable, isolating transformer, secondary cable,
fixture, lens, and lamp. Division of the model into these components
considered function, maintenance, physical proximity and connection, and
failure modes. Some of the components include several elements (e.g., the
control vault includes power transformers, relays, switches, etc.) while
others are composed of a single element (e.g., the lamp).

Figure  1 illustrates  the  general  lighting  system  model.  Since  the  number 
of  components  of  each  type  can  be  varied  (or  deleted  if  not  applicable),  this 
model  provides  the  necessary  flexibility  to  define  all  airfield  lighting 
systems. 

By  defining  the  geometry  of  a system,  the  operating  characteristics,  and 
the  failure  criteria,  any  lighting  system  can  be  analyzed  using  this  general 
model. 


System reliability is typically defined as the probability that a system
will perform its intended function in a specific environment for a specified
period of time. However, systems which undergo constant maintenance, as is
the case with airfield lighting, are composed of equipment of various ages
and thus a time period cannot be realistically analyzed. For such maintained
systems, the steady-state reliability, which can be interpreted as the
probability of the system being in a nonfailure state while under operation,
is significant.

Figure 2 is a tree structure depicting the parameters which must be
considered in analyzing the reliability of airfield lighting systems in
the steady state. Essentially three steps are required.

a.  Develop the component reliability function for each type of
component.

b.  Simulate the average light unit reliability.

c.  Calculate the system reliability by applying the system failure
criteria.

Thus, the reliability model includes both deterministic and stochastic
parameters which must be combined by using analytic and simulation procedures.
The following sections summarize the procedures employed in the three steps
of the model.


4.  The reliability of each component type in the general lighting system
model can be approximated by an exponential distribution over the component's
design life. This distribution is defined by Eq 1 and illustrated in Figure 3.


$R(t) = e^{-\lambda t}$    [Eq 1]


Figure  3,  Component  Reliability  Distribution. 


Determination of the failure rate ($\lambda$) for each component in the
system is quite complex when maintenance and operation practices are
considered. Full-scale testing of lighting systems to determine failure rates
would be very expensive and time consuming, while accelerated testing of
systems or individual components introduces inaccuracies. Thus, field data
on system performance are the best source of information for determining
a component's reliability function.

Considering  the  field  data  anticipated  to  be  available,  the  reliability 
function  for  each  component  in  the  lighting  system  can  be  expressed  by: 

(t-t  )C 

R(t)  « I ^ [Eq  2] 


where $C_f$ = the coefficient of failure

$C_m$ = the coefficient of maintenance

$t_s$ = the safety time (i.e., the period of time when the component
is known to have no chance of failure).




The coefficient of failure ($C_f$) for each component is computed from
Eq 3.


^ ^The  coefficient  of  maintenance  {C^)  for  each  component  Is  computed 


^m1 


In  In  Rj) 

TrnYf.-gto 


[Eq  43 


from 


where  t|__  ■ design  life  of  component  1 
tj  ■ safety  time  for  component  1 

• average  rel lability  for  component  1, 

“ 1- 

where  ■ failures  per  year  for  component  1 

Ng  ■ total  number  of  component  1 In  the  system 
Tj  ■ average  downtime  for  component  1 (hours) 

T ■ operation  time  per  year  (hours) 


Utilizing these relationships, Eq 2 can then empirically account for
preventive maintenance, corrective maintenance, and failure rate. Preventive
maintenance (PM) considers the component's design life, replacement time
(i.e., that period preceding the design life when group replacement is
undertaken), and, indirectly, the level of PM activities (i.e., the more PM
performed, the lower the failure rate). Corrective maintenance includes the
time to detect a failure and the time required to perform repairs. The
failure rate is the annual number of failures of that component type in a
system due to all failure modes (e.g., wear-out, human error, etc.).

5.  UNIT RELIABILITY.  Once the individual component reliabilities
have been determined, they can be combined to obtain a unit reliability using
Eq 5. The unit reliability ($R_u$) is defined as the probability that a
randomly chosen single unit in the system will be operational when called
upon to perform. The unit is composed of one of each component type in
the general lighting model as depicted in Figure 4.


$R_u = \{1 - [1 - R_1(t_1)][1 - R_2(t_2)]\} \cdot R_3(t_3) \cdot R_4(t_4) \cdot R_5(t_5) \cdot R_6(t_6) \cdot R_7(t_7) \cdot R_8(t_8) \cdot R_9(t_9) \cdot R_{10}(t_{10}) \cdot R_{11}(t_{11}) \cdot R_{12}(t_{12})$    [Eq 5]

To determine the average unit reliability, a Monte Carlo simulation
routine was developed to stochastically account for the time function and
system geometry factors. That is, the component's reliability is actually
a function of time and, in the steady state, the component's reliability
may be at any point of time on the function. In addition, the system
geometry, or the number of each component in the system, will also influence
the average unit reliability. The routine used is illustrated in Figure 5.

For each component, $i$, select $j$ random times ($t_{i,j}$) such that
$0 < t_{i,j} < t_r$

where $t_r$ = replacement time for component $i$

$j$ = number of component $i$ in the system

From the individual component reliability functions, determine
the component reliability for each $t_{i,j}$.

Combine component reliabilities ($R_{i,j}$) using:

$R_{u,j} = [1 - (1 - R_{1,j})(1 - R_{2,j})] \, R_{3,j} \, R_{4,j} \, R_{5,j} \, R_{6,j} \, R_{7,j} \, R_{8,j} \, R_{9,j} \, R_{10,j} \, R_{11,j} \, R_{12,j}$

to provide a unit system for each light.


Figure 5.  Simplified framework of average unit reliability simulation
routine.
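
The routine of Figure 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the RAALS code: it assumes the exponential form of Eq 1 for every component (rather than the maintained form of Eq 2), draws one random age per component for each light, and uses hypothetical failure rates and replacement times. The function names are invented for the example.

```python
import math
import random

def component_reliability(failure_rate, t):
    """Exponential reliability of Eq 1, R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t)

def unit_reliability(r):
    """Combine 12 component reliabilities as in Eq 5.

    Components 1 and 2 (commercial and auxiliary power) act in parallel;
    components 3 through 12 act in series.
    """
    power = 1.0 - (1.0 - r[0]) * (1.0 - r[1])
    series = 1.0
    for value in r[2:]:
        series *= value
    return power * series

def average_unit_reliability(failure_rates, replacement_times, n_lights, seed=0):
    """Average unit reliability over n_lights simulated units.

    Simplifying assumption: each light gets its own random age for every
    component type, drawn uniformly between 0 and the replacement time.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_lights):
        r = [component_reliability(lam, rng.uniform(0.0, t_r))
             for lam, t_r in zip(failure_rates, replacement_times)]
        total += unit_reliability(r)
    return total / n_lights

# Hypothetical inputs: 12 failure rates (per hour) and replacement times (hours).
rates = [1e-4] * 12
times = [1000.0] * 12
print(average_unit_reliability(rates, times, n_lights=500))
```

Sampling the age uniformly reflects the statement that, in the steady state, a component may be at any point on its reliability function.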


6.  SYSTEM RELIABILITY.  An airfield lighting system fails when it does
not accurately transmit the information required by a pilot for safe operation
of an aircraft. Since pilot perception is involved, system failure is
subjective in nature. Through research studies, the FAA has established
objective failure criteria which provide minimum operating standards for each
type of lighting system.




In defining failure criteria, the airfield lighting systems have been
categorized as linear and bar systems. The linear system criteria stipulate
the percent of random outages and the number of consecutive outages. The
bar system criteria stipulate the percent of random outages, the number of
outages in a bar creating bar failure, and the number of consecutive bar
failures.

Using the appropriate failure criteria and the average unit reliability,
the reliability of the lighting system can be determined from Eq 6 for both
categories of systems.


$R_s = \sum_{i=0}^{n} W_i \, R_u^{\,n-i} (1 - R_u)^{i}$    [Eq 6]


where $R_s$ = the system reliability

$R_u$ = the unit reliability

$n$ = the total number of lights in the system

$i$ = the number of light failures in the system

$W_i$ = the number of ways $i$ failures can occur in a system of $n$
total lights without the system reaching failure by either
the random or the consecutive failure criteria.


The following example illustrates the application of this equation.

The example involves finding the system reliability for a three-lamp system
($n = 3$) with system failure defined as all three lamps out or two consecutive
lamps out. The probability of a lamp being on is $R_u$. Table 1 shows the
eight possible conditions in which this system can be; three are failures
and five are successes.

The system reliability is

$R_s = \sum_{i=0}^{3} W_i \, R_u^{\,3-i} (1 - R_u)^{i}$

$R_s = 1 R_u^{3-0} (1 - R_u)^{0} + 3 R_u^{3-1} (1 - R_u)^{1} + 1 R_u^{3-2} (1 - R_u)^{2} + 0 R_u^{3-3} (1 - R_u)^{3}$


TABLE 1

Possible Conditions for Example

Failures (i)     0       1             2             3
Lamp 1           O       X   O   O     X   O   X     X
Lamp 2           O       O   X   O     X   X   O     X
Lamp 3           O       O   O   X     O   X   X     X
Result           S       S   S   S     F   F   S     F
W_i              1       3             1             0

S = Success
F = Failure
O = Light operating
X = Light failed

In a linear lighting system, if the consecutive failure criterion is
not considered, Eq 6 reduces to a binomial distribution or


$R_s = \sum_{i=0}^{NR} \binom{n}{i} R_u^{\,n-i} (1 - R_u)^{i}$    [Eq 7]


where $NR$ = number of random failures allowed in the system.




To consider consecutiveness as well as random outages in the failure criteria,
an analytical procedure has been developed to compute each $W_i$. Since $W_i$
is a multivariate integer function of $n$, $NC$, and $i$ (where $NC$ = number of
consecutive failures allowed and $n$ and $i$ as previously defined), there
is a unique constant for each ($n$, $NC$, $i$) which is defined here as the
consecutive coefficient, $C(n, NC, i)$. This coefficient is the number of
ways that $i$ outages can be distributed in $n$ total lights without having
more than $NC$ consecutive outages. Substituting the coefficient in Eq 6
produces:


$R_s = \sum_{i=0}^{NR} C(n, NC, i) \, R_u^{\,n-i} (1 - R_u)^{i}$    [Eq 8]


(Note that the summation is from $i = 0$ to $i = NR$ since $W_i$ goes to zero
when the number of outages, $i$, exceeds the allowable random outages, $NR$.)

An automated procedure is used to compute the consecutive coefficients
based on the following recursive function:


$C(n, NC, i) = C(n-1, NC, i) + C(n-1, NC, i-1) - C(n-NC-2, NC, i-NC-1)$    [Eq 9]


The derivation and development of the program may be found elsewhere.
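
A short Python sketch of Eq 9 and Eq 8 follows. The boundary values are not stated in the text and are assumptions chosen so that the recursion reproduces direct counting: $C(0, NC, 0) = 1$, $C(-1, NC, 0) = 1$ (the case where the run of outages starts at the first light), and zero whenever $i < 0$, $i > n$, or $n < -1$. The function names are illustrative.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def consecutive_coefficient(n, nc, i):
    """Ways to place i outages in n lights with no run longer than nc (Eq 9)."""
    if i < 0 or n < -1:
        return 0
    if n <= 0:
        # Assumed boundary values: C(0, nc, 0) = C(-1, nc, 0) = 1.
        return 1 if i == 0 else 0
    if i > n:
        return 0
    return (consecutive_coefficient(n - 1, nc, i)
            + consecutive_coefficient(n - 1, nc, i - 1)
            - consecutive_coefficient(n - nc - 2, nc, i - nc - 1))

def system_reliability_eq8(r_u, n, nc, nr):
    """Evaluate Eq 8: sum over i = 0..nr of C(n, nc, i) R_u^(n-i) (1-R_u)^i."""
    return sum(consecutive_coefficient(n, nc, i)
               * r_u ** (n - i) * (1.0 - r_u) ** i
               for i in range(nr + 1))

# Three-lamp example of Table 1: NC = 1, NR = 2.
print([consecutive_coefficient(3, 1, i) for i in range(4)])  # [1, 3, 1, 0]
print(system_reliability_eq8(0.9, 3, 1, 2))                  # approximately 0.981

# With NC at least as large as n the coefficient is the binomial count of Eq 7.
print(consecutive_coefficient(5, 5, 2) == comb(5, 2))        # True
```

Memoizing the recursion keeps the cost proportional to the number of distinct ($n$, $i$) pairs, which matters for systems with many lights.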

The method for analyzing the bar lighting systems is similar. However,
the determination of $W_i$ is much more complex due to the nature of the bar
system failure criteria. A detailed description of the bar system analytical
technique is given in the project final report.[1]


[1] Lindow, E. S. and Kuo, F. "Reliability Analysis For Airfield Lighting
Systems" Final Report for Contract D01-FAG6WAI-118, CERL, September 1976.

The reliability methodology summarized in the previous sections would be
difficult to apply manually when considering the number of lights in a
system, the stochastic properties of the component reliabilities, and the
sophistication of the failure criteria. Thus, the procedures have been
computerized in the RAALS (Reliability Analysis of Airfield Lighting System)
program. This program is capable of efficiently estimating the functional
reliability of any lighting system used in the visual guidance of aircraft.
Flexibility is provided in the program to consider various system
configurations and failure criteria as well as different component failure
rates, design lives, and levels of maintenance.

Figure 6 is a simplified flow chart of the RAALS program. Figure 7
presents the input data listing, a typical component reliability function,
and the system reliability output resulting from an example problem.


CONCLUSIONS.  The automated procedure for analyzing reliability of
airfield lighting systems (RAALS) is an implementable tool which can be
used to:


a.  Compare the reliability of similar systems,

b.  Determine where a system should be improved to increase its
reliability,

c.  Form a basis for decisions on implementing changes to failure
criteria, equipment, or maintenance policies,

d.  Monitor the reliability of a system as it becomes older or as
modifications are installed.

The RAALS program logic is based on traditional reliability theory.
However, due to the number and complexity of lighting systems and the
necessity to consider consecutiveness in the failure criteria, original
analytical techniques were developed and interfaced with traditional
theory. These techniques included:

a.  Formulation of a general lighting system model capable of
considering all of the diverse equipment and geometry encountered in airfield
lighting

b.  Adaptation of a Monte Carlo simulation routine to the analysis
to account for the stochastic nature of the component reliabilities

c.  Derivation of the consecutive coefficient to consider
consecutiveness in the system failure criteria

d.  Development of an analytical procedure to determine system
reliability which accounts for the operation, maintenance, and failure
variables of each component

e.  Automation of the combined procedures into a concise, efficient
computer program.

Although this research effort was devoted to airfield lighting systems,
the methodology developed is applicable to any system which can be similarly
defined and for which failure criteria stipulate consecutive failures
as well as random failures.


ABSTRACT.  This paper deals with a simplified method of determining the
approximate lower confidence bounds on reliability of a system, given the
system posterior reliability beta parameters $A'_s$ and $B'_s$ (integer or
non-integer) and/or trials and failures observed and the interval desired. Prior
to the development of this method, a computer was utilized to determine the
lower bounds due to the fact that the beta parameters were, for the most part,
non-integer. The method described in this paper was empirically developed and
provides a method of determining approximate reliability bounds very simply
with the use of a SR 51, HP 45, etc., hand calculator. The unsolved problem
simply stated is "Why does the method work as well as it does?"

1.  INTRODUCTION.  A need arose in ARMCOM for a simplified method of
determining approximate lower bounds on reliability, given subsystem data,
a model and the confidence interval desired. As a result, a literature search
was made of current available methods. These methods are referred to by
comparison in our paper titled, "Confidence Limits for System Reliability When
Testing Takes Place at the Component Level," dtd 31 Oct 75. Based on the
review of the current available methods, it was decided to see if a more
simplified method could be developed which would overcome some of the
shortcomings of the current methods and still provide results which would satisfy
our needs. A method was developed as described in the reference paper; however,
the mathematical expression derived empirically for calculating the lower
bound is still, to this day, not fully understood.

2.  THE LOWER BOUND ON RELIABILITY.  The lower bound on reliability is
determined as follows:

Given:  $A_s$, $B_s$ = system posterior reliability parameters of a beta
$1 - \alpha$ = Confidence interval desired

A fractional chi-square table is required; however, linear interpolation
can be utilized.

4.  PROBLEM.  An understanding of the expression

is needed in order to provide an answer to the many inquiries concerning the
mathematical validity of the above expression.

5.  RESULTS.  Many values of $A_s$ and $B_s$, both integer and non-integer,
were compared. The values shown are just a few of the comparisons made.
Other comparisons at different confidence intervals were made as shown in
Tables 1 through 4.

For whatever help it may be, the relationship between the F distribution
and the expression was found to be:

From this expression, approximate values of the F distribution can be
obtained for non-integer degrees of freedom.
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LOWER  LIMITS  OF  SOX  COMFIDEMCS  INTERVAL 
(BINOMIAL  DATA)


EVALUATION OF GUNNER ERRORS THROUGH TIME SERIES ANALYSIS


Latricha Greene and John Howerton
Systems Evaluation
Aeroballistics Directorate
US Army Missile Research and Development Command
Redstone Arsenal, Alabama 35809


ABSTRACT

This paper describes a procedure used at the Army Missile Command
(primarily with command to line of sight systems) for modelling man
in the loop. The model developed here with its parameters can be used
to simulate data or to drive a total systems simulation.

The procedure outlined here was developed initially by L. Greene,
J. Howerton, N. Rich, and M. Wise of the Army Missile Command in
conjunction with M. Yang from the University of Florida for the optical
mode of Air Defense Systems in which a man was used to track the
target. Current plans call for using this same technique to evaluate
tracking radars during an ECM environment.

The analysis of the original work as described here was concerned
only with stationary data.

1.  Introduction

Prediction of the amount of error due to gunner tracking
of a moving target is an important phase in the development of weapon
systems. Data of this type occur in the form of time series. The
observations are dependent and the nature of this dependence is of
utmost importance.

The purpose of this paper is to present a method for evaluating
gunner error data described below, thereby defining a time series
model. This model and its parameters can be used to simulate data for
future problems of a similar nature or may be used as a subroutine to
missile flight simulation.


2.  Data Description

The initial tests to determine the gunner tracking error
characteristics were conducted at Redstone Arsenal during the period
13 through 18 July 1972. The King Air, a twin engine Beechcraft, was
the target utilized for these tests.

A 16mm film camera was attached to the monocular output of the
tracker unit. This output presents the same view to the film camera
as the binocular output presents to the gunner.

There were four gunners who participated in the tests. They were
instructed to track the centroid of the target aircraft when details
were not resolvable. When resolvable they were to track the
intersection of the wing and fuselage. The amount of error was shown to be
independent of individual gunner, that is, there was no statistical
significance.


3.  Model Building

This section discusses the time series model building for the
gunners' error data. After examining all the data available, we conclude
that the data form a stationary time series except at the beginning
where a transient occurs, during acquisition, and at the end where a
transient is introduced by the simulated missile in flight signal.

Runs with too few data were eliminated. The total number of runs was
then 143. A few nonstationary data can also be seen. They occupy
13.29 percent of the total.

When the data are recorded with equally spaced time intervals, we
generally use a linear time series model to fit the data. A commonly
used model for univariate time series can be written as

$(Y_t - \mu) = \phi_1 (Y_{t-1} - \mu) + \phi_2 (Y_{t-2} - \mu) + \cdots + \phi_p (Y_{t-p} - \mu) + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$    (1.1)

where

subscript $t$ = time

$Y_t$ = the value of the time series at time $t$

$\mu$ = the expected value of $Y_t$

$a_t$ = a white noise process, i.e., $a_t$ is independent,
identically distributed $N(0, \sigma_a^2)$

$p$, $q$ = two parameters depending on the properties of a
particular time series.

Model (1.1) is called a mixed model with autoregressive and moving
average components. It has been widely used in practice with fruitful
results (see e.g., Box and Jenkins [1], Fuller and Tsokos [2], Cleveland
[3, 4], and Box et al. [5]). The intuitive idea behind the model (1.1)
is the assumption that the present value $Y_t$ depends on the values of
$Y$ in the near past, i.e., $Y_{t-1}$, ..., $Y_{t-p}$, through the
autoregressive component

$(Y_t - \mu) = \phi_1 (Y_{t-1} - \mu) + \phi_2 (Y_{t-2} - \mu) + \cdots + \phi_p (Y_{t-p} - \mu)$

The moving average component $a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$
indicates that the present value $Y_t$ depends not only on the present noise
$a_t$, but also the previous noise $a_{t-1}$, ..., $a_{t-q}$. This is
reasonable since the noise will not diminish very rapidly in real situations.
The noise prolongs its influence on $Y_t$ for a certain period.

In practice when time series data are given, a model of the form
(1.1) can generally be built. The detailed procedure has been given in
Box and Jenkins [1]. There are four main steps.

a.  Model Identification

In this first step, autocorrelation coefficients, partial
autocorrelation coefficients, and inverse correlation coefficients
(e.g., Cleveland [3]) are used to determine the values of $p$ and $q$ in
model (1.1). The value $p$ is called the order of the autoregressive
component and the value $q$ is called the order of the moving average
component in a mixed model (1.1).
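
As an illustration of the quantities used in the identification step, the following Python sketch computes sample autocorrelations and, from them, partial autocorrelations by the Durbin-Levinson recursion. It is a generic sketch, not the authors' program; the function names are invented here and the inverse correlation coefficients are not covered.

```python
def sample_autocorrelation(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (r_0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
            for k in range(max_lag + 1)]

def partial_autocorrelation(rho):
    """Partial autocorrelations for lags 1..len(rho)-1 (Durbin-Levinson).

    rho is a list of autocorrelations starting with rho[0] = 1.
    """
    pacf = []
    phi_prev = []  # coefficients of the fitted AR(k-1) model
    for k in range(1, len(rho)):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Theoretical autocorrelations of a first order autoregressive process
# with coefficient 0.5: the partial autocorrelation cuts off after lag 1.
print(partial_autocorrelation([1.0, 0.5, 0.25, 0.125]))
```

For a pure autoregressive process of order $p$ the partial autocorrelations vanish beyond lag $p$, which is how a value such as $p = 3$ would show up in practice.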

b.  Parameter Estimation

After the values of $p$ and $q$ have been determined, there
are $p + q + 2$ parameters: $\mu$, $\phi_1$, ..., $\phi_p$, $\theta_1$, ...,
$\theta_q$, and the variance $\sigma_a^2$ of $a_t$ to be determined. The
method used to estimate the $\phi$'s and $\theta$'s has been described in Box
and Jenkins (Chapter 7, [1]), Clevenson [6], and Parzen [7]. The main
technique is the maximum likelihood estimation. Generally, the calculation
needs the help of spectral density estimation [7] or nonlinear least squares
estimation [1].


The estimated values $\hat{\mu}$, $\hat{\phi}$, $\hat{\theta}$, and
$\hat{\sigma}_a^2$ of the parameters $\mu$, $\phi$, $\theta$, and $\sigma_a^2$,
respectively, are not generally equal to the real value of these parameters.
The model with estimated parameters

$(Y_t - \hat{\mu}) = \hat{\phi}_1 (Y_{t-1} - \hat{\mu}) + \cdots + \hat{\phi}_p (Y_{t-p} - \hat{\mu}) + \hat{a}_t - \hat{\theta}_1 \hat{a}_{t-1} - \cdots - \hat{\theta}_q \hat{a}_{t-q}$    (1.2)

may not fit the original data well. Diagnostic checking determines
whether our estimated model fits the data well. The residual process
$\{\hat{a}_t\}$ is examined. If $\{\hat{a}_t\}$ is close to a white noise
process, the model is considered to be adequate and the whole model building
procedure is over. Otherwise, we go to the next step.
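
One common way to judge whether residuals resemble white noise is a portmanteau statistic on their autocorrelations. The sketch below computes residuals for a pure autoregressive fit and the Box-Pierce statistic $Q = n \sum_k r_k^2$. The text does not say which check was actually used, so this choice, and the function names, are assumptions made for illustration.

```python
def ar_residuals(series, mu, phi):
    """Residuals of an autoregressive fit with mean mu and coefficients phi."""
    p = len(phi)
    residuals = []
    for t in range(p, len(series)):
        predicted = mu + sum(phi[j] * (series[t - 1 - j] - mu) for j in range(p))
        residuals.append(series[t] - predicted)
    return residuals

def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic: n times the sum of squared residual autocorrelations."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev) / n
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
        q += r_k * r_k
    return n * q

# A strongly alternating residual sequence is far from white noise.
print(box_pierce_q([1.0, -1.0, 1.0, -1.0], max_lag=1))  # 2.25
```

A large value of the statistic relative to a chi-square reference would suggest that the residuals still carry structure and that the model should be revised.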


If the model we built is found inadequate through the
diagnostic checking, we will try to fit the data by a new modified
model. Generally, the residual process $\{\hat{a}_t\}$ will reveal some
information on how the model should be rebuilt. In most cases, a pair of new
values of $p$ and $q$ will be obtained. Using these new values of $p$ and $q$,
we undergo steps b., c., and d. for this new model building.

All the four steps have been carefully followed for building the
gunners' error data model. For the (apparently) stationary time series,
with azimuth and elevation both counted, the total number of realizations
was 248. Each time series of azimuth and elevation is run separately
(Tables 1 and 2). Sixty-two percent of the stationary series can be
fitted well by a third order autoregressive process [$p = 3$, $q = 0$ in
model (1.1)], i.e.,

$(Y_t - \mu) = \phi_1 (Y_{t-1} - \mu) + \phi_2 (Y_{t-2} - \mu) + \phi_3 (Y_{t-3} - \mu) + a_t$    (1.3)

A few data cannot be fitted well by (1.3); they are fitted by a more
complicated model. These models and their percentages of the total
data are given in Table 1. Due to the biological and psychological
differences among gunners, there are variations in these parameters.

The means and variances of these parameters are also given in Table 1.
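
Model (1.3) is straightforward to simulate. The Python sketch below uses the mean azimuth parameters of the general model in Table 1; it treats the last tabulated value (0.0128) as the noise variance $\sigma_a^2$, which is an assumption because the column heading is not fully legible. The function name, burn-in length, and seed are likewise illustrative.

```python
import math
import random

# Mean azimuth parameters of the general model as read from Table 1.
MU = 0.0393
PHI = (0.4489, 0.2362, 0.1243)
SIGMA_A2 = 0.0128  # assumed to be the noise variance, not its square root

def simulate_ar3(n, mu=MU, phi=PHI, noise_variance=SIGMA_A2, seed=0, burn_in=200):
    """Simulate n values of model (1.3) after discarding a start-up transient."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)
    dev = [0.0, 0.0, 0.0]  # deviations Y - mu at times t-3, t-2, t-1
    series = []
    for step in range(burn_in + n):
        a_t = rng.gauss(0.0, sigma)
        new = phi[0] * dev[2] + phi[1] * dev[1] + phi[2] * dev[0] + a_t
        dev = [dev[1], dev[2], new]
        if step >= burn_in:
            series.append(mu + new)
    return series

errors = simulate_ar3(1000)
print(sum(errors) / len(errors))  # sample mean, close to MU for long series
```

Discarding the first values mirrors the treatment of the acquisition transient in the data, so that the retained series behaves as a stationary process.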

TABLE 1.  GUNNER'S ERROR MODEL FOR AZIMUTH


General model (3rd order autoregressive process) (62.90%)

              $\mu$       $\phi_1$    $\phi_2$    $\phi_3$    $\sigma_a^2$
Mean          0.0393      0.4489      0.2362      0.1243      0.0128
Variance      0.0108      0.0170      0.0066      0.0087      0.0001

Special Model                        Mean        Variance
1)  $\phi_4 \neq 0$     (17.74%)     0.1490      0.0050
2)  ... $\neq 0$        (4.84%)      0.0962      0.0076
3)  ... $\neq 0$        (5.65%)      0.0485      0.0189
4)  ... $\neq 0$        (3.22%)      0.1218      0.0060
5)  ... $\neq 0$        (0.81%)      0.1844      0.0000
6)  ... $\neq 0$        (2.42%)      0.0196      0.0142
7)  ... $\neq 0$        (1.61%)      0.0368      0.0145
8)  ... $\neq 0$        (0.81%)      0.0970      0


General model (3rd order autoregressive process) (60.48%)

              $\mu$       $\phi_1$    $\phi_2$    $\phi_3$    $\sigma_a^2$
Mean          -0.0515     0.3692      0.2165      0.1448      0.0045
Variance      0.0124      0.0188      0.0051      0.0057      0.0001

Special Model                        Mean        Variance
1)  ... $\neq 0$        (15.32%)     0.1535      0.0014
2)  ... $\neq 0$        (8.87%)      0.1255      0.0088
3)  $\phi_6 \neq 0$     (7.25%)      0.1125      0.0039
4)  ... $\neq 0$        (4.84%)      0.1221      0.0087
5)  ... $\neq 0$        (0.81%)      0.1497      0
6)  ... $\neq 0$        (0.81%)      0.1294      0
7)  ... $\neq 0$        (0.81%)      0.1178      0
8)  ... $\neq 0$        (0.81%)      0.?573      0

A question arises whether the azimuth error and elevation error
are dependent on each other during a gunner's aiming. The data show
that we can consider the azimuth error and elevation error to be two
independent processes. The following procedure is followed.

A general model describing the relation between two time series is
a linear transfer function model. Let $Y_t$ be the time series of azimuth
and $X_t$ be the time series of elevation. A linear transfer function
model can be written as

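$$(Y_t - \mu_y) = \alpha_1 (Y_{t-1} - \mu_y) + \dots + \alpha_m (Y_{t-m} - \mu_y)
  + \beta_1 (X_{t-1} - \mu_x) + \dots + \beta_n (X_{t-n} - \mu_x) + \varepsilon_t \qquad (1.4)$$

where

$\mu_y = E(Y_t)$, $\mu_x = E(X_t)$,
$\varepsilon_t$ = a noise process,
m, n = the numbers of past values of $Y_t$ and $X_t$ on which the present
$Y_t$ depends.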

Intuitively, model (1.4) indicates that the present azimuth value $Y_t$
may depend on the previous values of both azimuth and elevation. This
model has been used in many practical situations and gives good results
(see e.g., Box and Jenkins [1]). Since we have already found a good
model for $Y_t$ in the previous model building, we may combine the Y
model and (1.4) and have

$$a_t = \beta_1 (X_{t-1} - \mu_x) + \dots + \beta_n (X_{t-n} - \mu_x) + \varepsilon_t,$$

where $a_t$ is the noise process from the model of $Y_t$. Since $a_t$ is a
white noise process, the values of the $\beta$'s can be easily estimated (Box and
Jenkins [1] p. 380).

An attempt has been made to fit all the corresponding pairs of
azimuth error data and elevation error data by model (1.4). Except for
a few exceptions (1 percent of the total), the $\beta$ values are very small
(less than 0.03 for all $\beta_j$'s). Hence, we consider that the
error in elevation has no significant influence on that in azimuth. A
similar model fitting by replacing X by Y and Y by X in (1.4) has also
been run for all pairs of data. An independence relation is also
obtained here. Hence, we conclude that there is no significant
dependence between azimuth error and elevation error.


4. Simulation Procedure

In order to simulate the total performance of a guided missile
system with a man in the loop, we may use the gunner's model described
in the previous section. Considering the nonrepeatability of man's
reactions, it must be realized that for any single simulation the error
model will not give the same results as given by man. However, man's
behavior on the average should agree with that of the error model.

Simulation of a gunner's behavior may be performed as follows:
a) Choose 2 random numbers $\gamma_1$ and $\gamma_2$ in [0, 1]. $\gamma_1$ is used to
construct azimuth error. If

$\gamma_1 \in$ [0, 0.6290], a third order autoregressive model will be used,
$\gamma_1 \in$ [0.6291, 0.8064], a fourth order autoregressive model with
$\phi_4 \neq 0$ will be used,
$\gamma_1 \in$ [0.8065, 0.8548], a fifth order autoregressive model with
$\phi_5 \neq 0$ will be used,
$\gamma_1 \in$ [0.8549, 0.9113], a sixth order autoregressive model with
$\phi_6 \neq 0$ will be used,
$\gamma_1 \in$ [0.9114, 0.9435], a seventh order autoregressive model
with $\phi_7 \neq 0$ will be used,
$\gamma_1 \in$ [0.9436, 0.9516], an eighth order autoregressive model with
$\phi_8 \neq 0$ will be used,
$\gamma_1 \in$ [0.9517, 0.9758], a ninth order autoregressive model with
$\phi_9 \neq 0$ will be used,
$\gamma_1 \in$ [0.9759, 0.9919], a tenth order autoregressive model with
$\phi_{10} \neq 0$ will be used,
$\gamma_1 \in$ [0.9920, 1.00], an eleventh order autoregressive model with
$\phi_{11} \neq 0$ will be used.

Thus, we have chosen a model for the azimuth error process. $\gamma_2$ is
used to construct elevation error. If

$\gamma_2 \in$ [0, 0.6048], a third order autoregressive model will be used,
$\gamma_2 \in$ [0.6049, 0.7580], a fourth order autoregressive model with
$\phi_4 \neq 0$ will be used,
$\gamma_2 \in$ [0.7581, 0.8467], a fifth order autoregressive model with
$\phi_5 \neq 0$ will be used,
$\gamma_2 \in$ [0.8468, 0.9192], a sixth order autoregressive model with
$\phi_6 \neq 0$ will be used,
$\gamma_2 \in$ [0.9193, 0.9676], a seventh order autoregressive model
with $\phi_7 \neq 0$ will be used,
$\gamma_2 \in$ [0.9677, 0.9757], an eighth order autoregressive model with
$\phi_8 \neq 0$ will be used,
$\gamma_2 \in$ [0.9758, 0.9838], a ninth order autoregressive model with
$\phi_9 \neq 0$ will be used,
$\gamma_2 \in$ [0.9839, 0.9919], a tenth order autoregressive model with
$\phi_{10} \neq 0$ will be used,
$\gamma_2 \in$ [0.9920, 1.0], an eleventh order autoregressive model with
$\phi_{11} \neq 0$ will be used.

b) Use a normal random number generator to generate the required
parameters $\mu$, $\phi$'s, and $\sigma_a^2$.

c) Using a polynomial root solver, check the roots of
$z^p - \phi_1 z^{p-1} - \dots - \phi_p = 0$. If any of the roots is greater
than or equal to 1 in absolute value, discard this set of $\phi$'s and
select another group of parameters.

d) Let $X_t$ denote the azimuth error process and $Y_t$ denote the
elevation error process. Then according to the models and parameters
chosen by steps a) and b), we can simulate $X_t$ and $Y_t$ consecutively
by generating normal random deviates $a_t$ from $N(0, \sigma_a^2)$.

e) If the perfect aim of a gunner at time t is $(A_t, E_t)$, then
our simulated coordinate of a gunner at time t is $(A_t + X_t, E_t + Y_t)$.
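
Steps a) through d) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used in the study: the interval bounds are the cumulative percentages of Tables 1 and 2, the recursion is applied to deviations from the drawn mean, the series is started at its mean, and the root check of step c) is replaced by the equivalent step-down (Levinson) test so that only the standard library is needed. Only the third order draw is shown; a higher order model would add the extra coefficient from the Special Model rows.

```python
import random

# Upper ends of the step a) intervals for model orders 3..11.
AZIMUTH_BOUNDS = [0.6290, 0.8064, 0.8548, 0.9113, 0.9435,
                  0.9516, 0.9758, 0.9919, 1.0]
ELEVATION_BOUNDS = [0.6048, 0.7580, 0.8467, 0.9192, 0.9676,
                    0.9757, 0.9838, 0.9919, 1.0]

# Table 1 general-model values: mu, phi_1, phi_2, phi_3, sigma_a^2.
AZ_MEANS = [0.0393, 0.4489, 0.2362, 0.1243, 0.0128]
AZ_VARS = [0.0108, 0.0170, 0.0066, 0.0087, 0.0001]

def choose_order(u, bounds):
    """Step a): map a uniform number u in [0, 1] to an AR order 3..11."""
    for order, upper in enumerate(bounds, start=3):
        if u <= upper:
            return order
    return 11

def is_stationary(phi):
    """Step c) stand-in: all step-down reflection coefficients below 1."""
    a = list(phi)
    while a:
        k = a[-1]
        if abs(k) >= 1.0:
            return False
        m = len(a) - 1
        a = [(a[i] + k * a[m - 1 - i]) / (1.0 - k * k) for i in range(m)]
    return True

def draw_third_order(rng, means, variances):
    """Steps b) and c): redraw the parameters until the draw is usable."""
    while True:
        mu, p1, p2, p3, s2 = (rng.gauss(m, v ** 0.5)
                              for m, v in zip(means, variances))
        if s2 > 0 and is_stationary([p1, p2, p3]):
            return mu, [p1, p2, p3], s2

def simulate_ar(phi, mu, sigma2, n, rng):
    """Step d): generate n values of the AR process around the mean mu."""
    p = len(phi)
    history = [mu] * p  # assumed start-up: begin at the mean
    out = []
    for _ in range(n):
        shock = rng.gauss(0.0, sigma2 ** 0.5)
        value = mu + sum(phi[i] * (history[-1 - i] - mu) for i in range(p)) + shock
        history.append(value)
        out.append(value)
    return out

rng = random.Random(1)
order = choose_order(rng.random(), AZIMUTH_BOUNDS)
mu, phi, sigma2 = draw_third_order(rng, AZ_MEANS, AZ_VARS)
x = simulate_ar(phi, mu, sigma2, 50, rng)
```

Step e) then amounts to adding each simulated error to the corresponding perfect-aim coordinate.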

A simulation example: see

Future Use


Although the original work dealt mostly with a stationary
set of data, there is no reason why this technique could not be used
with non-stationary data simply by using the difference equations as
outlined in Time Series Analysis, Forecasting and Control, by Box
and Jenkins.

A study is underway to evaluate the ROLAND Air Defense System
during an ECM environment. This is a very critical area and one that
so far has not been investigated with a systematic quantitative approach.
The approach offered here would be valid regardless of the type of
engagement (optical or radar). Simply stated: a series of target
tracks are carried out and a time series model is built of the resulting
radar errors as a function of ECM and other parameters.

The final output of this study would be a computer program (or
subroutine integrated with the weapon system simulation) that could be
used for predicting end game results as a function of different types
of ECM throughout the ROLAND system engagement boundary.

The basic data needed to build the proposed model come from a
video camera bore-sighted to the track radar. An investigation of
the advantages of putting a missile beacon on the target is being
conducted at this time.

A METHOD FOR DETERMINING PAIRWISE
CONTRASTS FROM A FRIEDMAN TWO-WAY LAYOUT
BASED ON A THEOREM BY MARASCUILO

Jimmie C. DeLoach and Eugene F. Dutoit
United States Army Infantry Center
Fort Benning, Georgia 31905


1. INTRODUCTION.

The authors wish to express their appreciation to the US Army Research
Office and the Clinical panelists at the Twenty-second Conference on the
Design of Experiments for their valuable comments about this problem.

In recent years there has been an increased effort to produce more
and more non-parametric statistical tests. These tests have had broad
based applications in education and psychological research and to some
extent in military testing and evaluation of new products and training
methods.

The value of such non-parametric tests is well known. Although it
is not the purpose of this paper to demonstrate the usefulness of these
tests, it is worthwhile to restate one of the more salient features of
non-parametric tests: they do not depend upon sometimes unrealistic
distribution assumptions, such as the normality of the error
distribution, and in many cases they are more readily comprehended
and their test statistics more easily computed by a broader
spectrum of statisticians and researchers.

Friedman in 1937 introduced a test which is sometimes referred to
as the two-way analysis of variance by ranks. The method is outlined
in detail in Conover [ref 1, pp 264-274]; the test is considered to be
the non-parametric version of the familiar parametric two-way analysis
of variance (ANOVA). The parametric ANOVA is the usual way of testing
the hypothesis of no treatment differences. For experiments of the
randomized block design, and where there is one observation per block, the
Friedman test is used as a non-parametric method to test this same
hypothesis.

The subject of this paper is related to an extension of the Friedman
test to the case of several observations per block, given in Conover
[ref 1, p 273]. The example given in the next section will illustrate
the use of this extension. The data come from unpublished lecture
notes of reference 4.

2. EXAMPLE.

The hypothetical data of Table 1 represent scores on a reading
test given to seventh grade students following one, three, or five weekly
20 minute training periods on an electric talking typewriter programmed
to teach reading skills. The study was conducted across four different
schools, drawing from different social strata in the community and
taught by four different sets of teachers in four different classroom
environments.

Table 1. Scores on a Reading Test Following One, Three, or Five Weekly
20 Minute Training Periods on an Electric Talking Typewriter for Four
Different Schools.

School      Sessions per Week: 1, 3, 5

110 82 87 84 79 74 102 70 102 83 40 72 64
39 60 61 62 105 67 68 60 87 50 69 80 65

The data of Table 1 are ranked within each block. These rankings
appear in Table 2. The sums of ranks $R_j$ are also given.

Table 2. Observations Ranked Within Blocks and the Sum of Ranks.

School      Sessions per Week: 1, 3, 5

$R_1 = 77$    $R_2 = 80.5$    $R_3 = 154.5$


The expected value of $R_j$ is given by:

$$E(R_j) = \frac{bm(mk+1)}{2} = \frac{(4)(4)[(4)(3)+1]}{2} = \frac{16(13)}{2} = 104$$

where b = # blocks (schools)
      k = # treatments (sessions per week)
      m = # observations per cell

The Friedman test statistic is given by

$$T = \frac{12}{b\,k\,m^2 (mk+1)} \sum_{j=1}^{k} [R_j - E(R_j)]^2
    = \frac{12}{(4)(3)(16)(13)} \sum_{j=1}^{3} [R_j - 104]^2$$

$$= \frac{12}{(12)(208)} [27^2 + 23.5^2 + 50.5^2]
    = \frac{1}{208} [729 + 552.25 + 2550.25]
    = \frac{3831.5}{208} = 18.4$$

The distribution of $T$ can be approximated by the chi-square distribution
with $k-1$ degrees of freedom. For this example $k-1 = 2$ and
$\chi^2_{.95}(2) = 5.99$. Thus, we would reject a null hypothesis of no
treatment differences.
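
The arithmetic of the example can be reproduced with a short Python sketch. The function name is hypothetical; the rank sums 77, 80.5 and 154.5 and the critical value 5.99 are the ones quoted in the example.

```python
def friedman_extended(rank_sums, b, m):
    """Expected rank sum and test statistic T for b blocks, m observations per cell."""
    k = len(rank_sums)                               # number of treatments
    expected = b * m * (m * k + 1) / 2.0             # E(R_j)
    scale = 12.0 / (b * k * m ** 2 * (m * k + 1))
    t = scale * sum((r - expected) ** 2 for r in rank_sums)
    return expected, t

expected, t = friedman_extended([77.0, 80.5, 154.5], b=4, m=4)
reject_null = t > 5.99   # chi-square .95 point with k - 1 = 2 degrees of freedom
```

With these inputs the expected rank sum is 104 and T equals 3831.5/208, so the null hypothesis is rejected, in agreement with the text.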

3. PROBLEM.

In the preceding section, the results of the extension of the Friedman
test to the case of several observations indicate that significant
differences between the three treatments exist at the $\alpha = .05$ level. A
natural question arises, i.e., which treatments differ significantly
in a statistical sense? No post-hoc pairwise comparison procedures
are given in Conover for this extension. Also, Hollander and Wolfe (1973)
do not address this problem. A possible solution lies in extending
a theorem given by Marascuilo and McSweeney (1967) which is given in
the next section.

4. THEOREM (MARASCUILO - McSWEENEY). Let
$\psi = a_1\theta_1 + a_2\theta_2 + \dots + a_k\theta_k$, where
$\sum_{i=1}^{k} a_i = 0$, be a linear contrast of unknown parameters.
Consider the set of all possible linear contrasts of the form $\psi$. Let

$$\hat\psi = a_1\hat\theta_1 + a_2\hat\theta_2 + \dots + a_k\hat\theta_k \qquad (3)$$

be an estimate of $\psi$ with estimated variance given by

$$\mathrm{Var}(\hat\psi) = \sum_{i=1}^{k} a_i^2\, \mathrm{Var}(\hat\theta_i)
  + 2 \sum_{i<i'} a_i a_{i'}\, \mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) \qquad (4)$$

Then in the limit the probability is $1-\alpha$ that simultaneously for all
linear contrasts of the form $\psi$

$$\hat\psi - \sqrt{\chi^2_{1-\alpha}(k-1)}\,\sqrt{\mathrm{Var}(\hat\psi)}
  \le \psi \le
  \hat\psi + \sqrt{\chi^2_{1-\alpha}(k-1)}\,\sqrt{\mathrm{Var}(\hat\psi)}$$

The reader will note that this theorem is a chi-square analog to the
more familiar Scheffé theorem.

The proof of this theorem may be obtained from Marascuilo and McSweeney
(reference 3) upon request.

5. APPLICATION OF THE THEOREM. Let $R_j$ be the sum of the ranks as in
section 3. Let

$$\psi = a_1\theta_1 + a_2\theta_2 + \dots + a_k\theta_k \qquad (5)$$

be a linear contrast with estimate

$$\hat\psi = a_1 R_1 + a_2 R_2 + \dots + a_k R_k \qquad (6)$$

The variance of the contrast will be determined two ways: assuming
independence between treatment observations [i.e.,
$\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) = 0$] and the case where the
assumption of independence cannot be justified [i.e.,
$\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) \neq 0$].

a. If $\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) = 0$

$$\mathrm{Var}(\hat\psi) = a_1^2\, \mathrm{Var}(R_1) + a_2^2\, \mathrm{Var}(R_2) + \dots + a_k^2\, \mathrm{Var}(R_k)
  = \left[\frac{b\,m^2 (mk+1)(k-1)}{12}\right] \sum a_j^2 \qquad (7)$$

where $\mathrm{Var}(R_j)$ is given in Conover (p. 273).

b. If $\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) \neq 0$

$$\mathrm{Var}(\hat\psi) = \left[\frac{b\,m (mk+1)(mk-m+1)}{12}\right] \sum a_j^2 \qquad (8)$$

Now

$\hat\psi_1 = R_1 - R_2 = 77 - 80.5 = -3.5$
$\hat\psi_2 = R_1 - R_3 = 77 - 154.5 = -77.5$
$\hat\psi_3 = R_2 - R_3 = 80.5 - 154.5 = -74$


are the possible pairwise comparisons and their estimated values from
our original example. In order to test these values for significance,
we apply the Marascuilo - McSweeney theorem and compute the critical
differences.

a. If $\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) = 0$

$$CD = \sqrt{5.99}\,\sqrt{\mathrm{Var}(\hat\psi)} = (2.45)(16.65) = 40.79$$

b. If $\mathrm{Cov}(\hat\theta_i, \hat\theta_{i'}) \neq 0$

$$CD = \sqrt{5.99}\,\sqrt{\mathrm{Var}(\hat\psi)} = (2.45)(17.66) = 43.28$$

Any contrast which has an absolute value greater than CD is a
statistically significant contrast. Thus, at the $\alpha = .05$ level of
significance, $\hat\psi_2$ and $\hat\psi_3$ are significant contrasts.
Therefore, in relationship to our example, it would appear that five
sessions per week are necessary to increase the test scores and improve
reading skills. This conclusion is consistent with the findings of the
example source (reference 4).
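
The critical differences can be checked with the sketch below. It takes the two variance expressions as read from equations (7) and (8), with b = 4, k = 3, m = 4 and contrast coefficients (1, -1, 0) and its permutations; the function name and the `independent` flag are illustrative assumptions. Because the square root of 5.99 is not rounded to 2.45 here, the results differ from the printed values in the second decimal place.

```python
import math

def critical_difference(b, k, m, coeffs, chi2_crit, independent=True):
    """Critical difference sqrt(chi2) * sqrt(Var(psi_hat)) for a rank-sum contrast."""
    sum_sq = sum(a * a for a in coeffs)
    if independent:
        var = b * m ** 2 * (m * k + 1) * (k - 1) / 12.0 * sum_sq      # equation (7)
    else:
        var = b * m * (m * k + 1) * (m * k - m + 1) / 12.0 * sum_sq   # equation (8)
    return math.sqrt(chi2_crit) * math.sqrt(var)

rank_sums = [77.0, 80.5, 154.5]
contrasts = [(1, -1, 0), (1, 0, -1), (0, 1, -1)]
estimates = [sum(a * r for a, r in zip(c, rank_sums)) for c in contrasts]

cd_a = critical_difference(4, 3, 4, contrasts[0], 5.99, independent=True)
cd_b = critical_difference(4, 3, 4, contrasts[0], 5.99, independent=False)
significant = [abs(e) > max(cd_a, cd_b) for e in estimates]
```

Under either variance assumption only the two contrasts involving five sessions per week exceed the critical difference.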
ESTIMATE OF RELIABILITY IN THE
STRESS-STRENGTH MODEL

Asit P. Basu*

University of Missouri-Columbia

ABSTRACT

Suppose Y is the strength of a component which is subject to a
stress X. Then the component fails whenever $X \ge Y$, and there is
no failure when $X < Y$. In this paper the problem of estimating the
reliability function

$$R = P(X < Y)$$

is considered. A survey of available results is presented and some
new results are considered.

*Research supported by Army Research Office under Grant No. DAA 29-76-
G-0301 and by the Air Force Office of Scientific Research under
Grant No. AFOSR-75-2795B.

1. Introduction

Let X and Y be two random variables with cumulative distribution
functions F(x) and G(y) respectively. Suppose Y is the
strength of a component subject to a stress X. Then the component
fails if at any moment the applied stress (or load) is greater than
its strength or resistance. The stress is a function of the environment
to which the component is subjected, and its value at any point
of time is considered a random variable. The strength of a component
is measured by the stress required to cause failure. Strength depends on
material properties, manufacturing procedures and so on. If the components
under question are mass produced and their selection in a
given system is assumed to be made at random, then the strength should
also be considered a random variable. The reliability of a component
during a given period [0,T] is taken to be the probability that its
strength exceeds the stress during the entire interval, that is, the
reliability function R is given by

$$R = P(X < Y)$$

From practical considerations it is desirable to draw inference about
the reliability function. The problem of estimating R has been considered
by many using nonparametric, Bayesian and parametric approaches.
We shall present a survey of available results and consider some new
results.

The above model was first considered by Birnbaum (1956) and has
since found an increasing number of applications in many different
areas, especially in the structural and aircraft industries.

As an example, consider the following problem discussed by
Lloyd and Lipow (1962). A solid propellant rocket engine is successfully
fired provided the chamber pressure (X) generated by ignition
stays below the burst pressure (Y) of the rocket chamber. If
$X \ge Y$, the engine blows up and the operation is a failure.

Note the problem of inference about $R = P(X < Y)$ is similar to
the problem of estimation of $P = P(X \ge Y)$, the probability of failure.
So one can either talk of R, or of P.


2. Nonparametric approach

Let $(X_1, X_2, \dots, X_m)$ and $(Y_1, Y_2, \dots, Y_n)$ be two independent
samples of measurements on X and Y respectively. Let

$$\phi(X_i, Y_j) = \begin{cases} 1, & Y_j < X_i \\ 0, & \text{otherwise} \end{cases}$$

then

$$U = \sum_{i=1}^{m} \sum_{j=1}^{n} \phi(X_i, Y_j)$$

is the well known two sample Mann-Whitney statistic, that is

U = number of pairs $(X_i, Y_j)$ such that $Y_j < X_i$.

Birnbaum (1956) showed that the Mann-Whitney statistic U could be
used to estimate 1 - R (probability of failure), and hence R. In
particular

$$\hat P = 1 - \hat R = U/mn \qquad (2.1)$$

was proposed as an estimator of P = Pr (failure), and it was used
to obtain one sided confidence interval for P for the cases F
known, G unknown ($m \to \infty$), and both F and G unknown. Birnbaum
and McCarty (1958) considered a numerical procedure for computing
the sample sizes needed for the confidence interval based on U/mn.
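
As a small illustration of estimator (2.1), the following Python sketch counts the pairs in which the strength falls below the stress. The function name and the toy samples are assumptions made for the example.

```python
def failure_probability_estimate(stress, strength):
    """Mann-Whitney estimate U/(mn) of P = P(X >= Y) for continuous data."""
    u = sum(1 for x in stress for y in strength if y < x)  # pairs with Y_j < X_i
    return u / (len(stress) * len(strength))

stress = [1.0, 2.0, 3.0]      # sample of X (m = 3)
strength = [2.5, 4.0]         # sample of Y (n = 2)
p_hat = failure_probability_estimate(stress, strength)
r_hat = 1.0 - p_hat           # estimated reliability
```

Only the pair (3.0, 2.5) has the strength below the stress, so U = 1 and the estimate of P is 1/6.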

Owen, Craswell and Hanson (1964) showed that the assumption of
continuity required in Birnbaum (1956) was not essential and produced
some tables for use in computing sample sizes and confidence intervals
for the Birnbaum-McCarty procedure.

Govindarajulu (1968) also has explicitly derived one sided and
two sided distribution free confidence bounds for P based on the
asymptotic normality of $\hat P = U/mn$. These bounds are approximately
one half of the corresponding bounds due to Birnbaum and McCarty
(1958). In particular, Govindarajulu showed that for all F and G and
large m or n, the solution $\varepsilon$ of the equations

$$P(\hat P \le P + \varepsilon) = P(P \ge \hat P - \varepsilon) \ge \gamma, \quad 0 < \gamma < 1$$

is given by

$$\varepsilon \ge (4\nu)^{-1/2}\, \Phi^{-1}(\gamma),$$

and the solution of the equation

$$P(|\hat P - P| \le \varepsilon) \ge \gamma, \quad 0 < \gamma < 1$$

is given by

$$\varepsilon \ge (4\nu)^{-1/2}\, \Phi^{-1}\!\left(\frac{1+\gamma}{2}\right).$$

Here

$$\Phi(x) = \int_{-\infty}^{x} (2\pi)^{-1/2} e^{-t^2/2}\, dt,$$

and $\Phi^{-1}$ is the inverse function of $\Phi$.

Recently Govindarajulu (1974) has also considered a sequential
distribution-free procedure for obtaining fixed-width confidence limits
for P (and hence for R). However, in the absence of additional
numerical computation, it is not known how good the performance
of this sequential procedure is.

3. Bayesian Approach

Not much has been done from the Bayesian point of view.
Enis and Geisser (1971) investigated a Bayesian approach for estimating
R assuming X and Y to be independently distributed and that X and
Y are either exponentially distributed or normally distributed.

4. Parametric Approach

In many situations, the distribution of X (or of both X and Y)
will be known, and it is desired to obtain parametric solutions.
Thus, in case of missile flights, the stress may be expensive to
sample, but the physical characteristics of the missile system, such
as the propulsive force, angle of elevation, changes in atmospheric
condition, and so on may all have known distributions; consequently,
the distribution of stresses may be calculated. In this section, we
shall consider the problem of estimating R (or P) for specific
parametric distributions.

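4.1 Normal Distribution: Owen, Craswell and Hanson (1964) considered the
above problem and gave one sided confidence intervals for R when
both stress and strength are (a) jointly bivariate normally distributed
and observations are in pairs, or (b) when X and Y are independent
normal with a common unknown variance. Note if X and Y follow
a joint bivariate normal distribution with means $\mu_x$, $\mu_y$, variances
$\sigma_x^2$, $\sigma_y^2$ and covariance $\sigma_{xy} = \rho\,\sigma_x\sigma_y$,

$$R = P(X < Y) = P(Y - X > 0)
    = \Phi\!\left(\frac{\mu_y - \mu_x}{(\sigma_x^2 - 2\rho\,\sigma_x\sigma_y + \sigma_y^2)^{1/2}}\right)$$

and

$$\hat R = \Phi\!\left(\frac{\bar Y - \bar X}{(\sigma_x^2 - 2\rho\,\sigma_x\sigma_y + \sigma_y^2)^{1/2}}\right)$$

if $\sigma_x$, $\sigma_y$ and $\rho$ are known. Similarly if X and Y are independent

$$P(X < Y) = \int F(x)\, dG(x).$$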
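
A short Python sketch of the normal-theory reliability may make the formula concrete. It assumes the bivariate normal expression $R = \Phi\big((\mu_y - \mu_x)/(\sigma_x^2 - 2\rho\sigma_x\sigma_y + \sigma_y^2)^{1/2}\big)$ with all parameters known; the function name and the numerical inputs are illustrative only.

```python
import math
from statistics import NormalDist

def reliability_normal(mu_x, mu_y, sigma_x, sigma_y, rho=0.0):
    """R = P(X < Y) for jointly normal stress X and strength Y."""
    sd_diff = math.sqrt(sigma_x ** 2 - 2.0 * rho * sigma_x * sigma_y + sigma_y ** 2)
    return NormalDist().cdf((mu_y - mu_x) / sd_diff)

# Independent case: the standard deviation of Y - X is sqrt(9 + 16) = 5.
r_indep = reliability_normal(mu_x=0.0, mu_y=5.0, sigma_x=3.0, sigma_y=4.0)
# Equal means give R = 1/2 whatever the spread.
r_equal = reliability_normal(mu_x=2.0, mu_y=2.0, sigma_x=1.0, sigma_y=1.5)
```

Replacing the means by sample means gives the corresponding plug-in estimate when the variances and correlation are known.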

The same problems have been considered by Govindarajulu (1976), who
obtained two sided confidence intervals for R. Church and Harris (1970)
have also considered the same problems under the assumption that X
and Y are independent, normally distributed and the distribution of
X is known. Assume, without any loss of generality, that E(X) = 0
and Var(X) = 1. In this case,

$$R = P(X < Y) = \Phi\!\left(\frac{\mu}{\sqrt{1 + \sigma^2}}\right)$$

where $\mu = E(Y)$ and $\sigma^2 = E(Y - \mu)^2$. Church and Harris
considered the estimator

$$\hat R = \Phi\!\left(\frac{\bar Y}{\sqrt{1 + s^2}}\right) = \Phi(\hat V), \text{ say,}$$

where $\bar Y = \frac{1}{n}\sum Y_i$ and $s^2 = \sum (Y_i - \bar Y)^2/(n-1)$, from which they
obtained the following confidence interval for R:

$$P\{\Phi(\hat V - \Phi^{-1}(1 - \gamma/2)\,\hat\sigma_V) < R < \Phi(\hat V + \Phi^{-1}(1 - \gamma/2)\,\hat\sigma_V)\} = 1 - \gamma$$

Similarly, a one sided confidence interval is given by

$$P\{R > \Phi(\hat V - \Phi^{-1}(1 - \gamma)\,\hat\sigma_V)\} = 1 - \gamma$$

Here

$$\hat\sigma_V^2 = \frac{s^2}{n(1 + s^2)} + \frac{\bar Y^2 s^4}{2(n-1)(1 + s^2)^3}$$


The confidence interval obtained by Church and Harris compares
favorably with that of Govindarajulu (1968). Their procedure, although
empirically demonstrated to be superior to that of Govindarajulu,
is, however, inexact since it uses the asymptotic normal approximation
of a given statistic and requires the substitution of the population
mean and standard deviations by their observed sample values.
In fact, all the parametric estimators suffer from same weakness as

estimates of reliability and obtained mvue of reliability using
interference theory. Minimum variance unbiased estimator of R in
the normal case has also been considered by Downton (1975).

4.2 Gamma and Exponential distributions: Since in many physical
situations, specially in reliability and life testing problems,
exponential and gamma distributions provide more realistic models,
it is desirable to obtain estimators of R in these cases.

Let X and Y be independently distributed with density functions

$$f(x) = \frac{x^{p-1} e^{-x/\alpha}}{\Gamma(p)\,\alpha^p}, \quad x > 0,\ p > 0$$

$$g(y) = \frac{y^{q-1} e^{-y/\beta}}{\Gamma(q)\,\beta^q}, \quad y > 0,\ q > 0$$

respectively. Then

$$R = P(X < Y) = \int_0^\infty [1 - G(x)]\, dF(x)$$

$$= \int_0^\infty \left[\int_x^\infty \frac{1}{\Gamma(q)\beta^q}\, e^{-y/\beta} y^{q-1}\, dy\right]
    \frac{1}{\Gamma(p)\alpha^p}\, e^{-x/\alpha} x^{p-1}\, dx$$

$$= \sum_{k=0}^{q-1} \frac{\Gamma(p+k)}{\Gamma(p)\Gamma(k+1)} \cdot \frac{\alpha^k \beta^p}{(\alpha+\beta)^{p+k}}.$$

Here p and q are assumed to be known integers. If two independent
random samples $(X_1, X_2, \dots, X_m)$ and $(Y_1, Y_2, \dots, Y_n)$ from the two gamma
populations are available, mle of $\alpha$ and $\beta$ are given by
$\hat\alpha = \bar X/p$ and $\hat\beta = \bar Y/q$.

Hence mle of R is

$$\hat R = \sum_{k=0}^{q-1} \frac{\Gamma(p+k)}{\Gamma(p)\Gamma(k+1)} \cdot
  \frac{\hat\alpha^k \hat\beta^p}{(\hat\alpha+\hat\beta)^{p+k}}.$$

As special cases, if q = 1, that is if X follows the gamma distribution
and Y follows the exponential distribution

$$R = \{\beta/(\alpha+\beta)\}^p.$$

Finally, if both p and q are equal to 1, we have the case of two
independent exponential distributions and we have

$$\hat R = \frac{\hat\beta}{\hat\alpha + \hat\beta} = \frac{\bar Y}{\bar X + \bar Y}.$$

The distribution of $\hat R$, for large m and n, can be shown to be
normal and hence asymptotic confidence interval for R can be obtained.
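
The finite-sum form of R for integer shape parameters is easy to evaluate numerically. The sketch below assumes the scale parametrization in which X has mean $p\alpha$ and Y has mean $q\beta$, so that the estimates are $\bar X/p$ and $\bar Y/q$; it uses the identity $\Gamma(p+k)/(\Gamma(p)\Gamma(k+1)) = \binom{p+k-1}{k}$ for integer p. Function names and sample values are hypothetical.

```python
from math import comb

def reliability_gamma(alpha, beta, p, q):
    """R = P(X < Y) for independent gamma stress (shape p, scale alpha)
    and gamma strength (shape q, scale beta), with integer p and q."""
    return sum(comb(p + k - 1, k) * alpha ** k * beta ** p / (alpha + beta) ** (p + k)
               for k in range(q))

def mle_reliability_gamma(xs, ys, p, q):
    """Plug-in estimate using alpha_hat = mean(xs)/p and beta_hat = mean(ys)/q."""
    alpha_hat = sum(xs) / len(xs) / p
    beta_hat = sum(ys) / len(ys) / q
    return reliability_gamma(alpha_hat, beta_hat, p, q)

r_exp = reliability_gamma(1.0, 3.0, 1, 1)            # exponential case: beta/(alpha+beta)
r_q1 = reliability_gamma(1.0, 1.0, 2, 1)             # q = 1: (beta/(alpha+beta))**p
r_sym = reliability_gamma(2.0, 2.0, 2, 2)            # identical distributions
r_hat = mle_reliability_gamma([1.0, 3.0], [4.0, 8.0], 1, 1)
```

The special cases q = 1 and p = q = 1 reduce to the closed forms given in the text, which gives a quick check on the series.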

Tong (1974, 1975) has obtained mvue of R for gamma and exponential
distributions. The variance of the mvue of R in the
exponential case has been derived by Kelley et al (1976).

4.3 Weibull distribution: Let X and Y be independent random
variables each following the Weibull distribution with common shape
parameter $\delta$. That is let

$$F(x) = 1 - e^{-x^\delta/\alpha}, \quad \alpha > 0,\ x > 0$$

$$G(y) = 1 - e^{-y^\delta/\beta}, \quad \beta > 0,\ y > 0.$$

We can readily see

$$R = P(X^\delta < Y^\delta) = P(X < Y) = \frac{\beta}{\alpha + \beta}.$$

Note above is independent of $\delta$. Again, we can obtain the mle
of R to be

$$\hat R = \hat\beta/(\hat\alpha + \hat\beta)$$

where $\hat\alpha$ and $\hat\beta$ are mle of $\alpha$ and $\beta$.

4.4 Bivariate exponential distribution: Since exponential distribution
is considered a useful model in life testing problems, it is
desirable to consider a bivariate analogue of univariate exponential
distributions which will have properties similar to the univariate
exponential distribution. Marshall and Olkin (1967) have proposed
a very important bivariate exponential distribution (BVE), which
is given by

$$\bar F(x,y) = P(X > x, Y > y) = \exp[-\lambda_1 x - \lambda_2 y - \lambda_{12} \max(x,y)], \quad x, y \ge 0.$$

The BVE does arise in several natural ways and its properties appear
to be fundamental. In particular, marginal distributions of
BVE are exponential and BVE has the loss of memory property (LMP)
given by

$$\bar F(s_1 + t, s_2 + t) = \bar F(s_1, s_2)\, \bar F(t,t) \quad \text{for } s_1, s_2, t \ge 0.$$

However, this distribution is not absolutely continuous and there
are clearly situations when it cannot be applied. Thus, if from data
it is found that $X \neq Y$ for every pair (X,Y), the model is clearly
not applicable. An alternative absolutely continuous distribution
related to the BVE and having some of its properties would appear
to be of interest. To this end, Block and Basu (1974) have proposed
an absolutely continuous bivariate exponential extension (ACBVE),
which turns out to be the absolutely continuous part of the BVE of
Marshall and Olkin. ACBVE is also seen to be a variant of the
distribution of Freund (1961). The ACBVE is given by

$$\bar F(x,y) = \frac{\lambda}{\lambda_1 + \lambda_2} \exp[-\lambda_1 x - \lambda_2 y - \lambda_{12} \max(x,y)]
  - \frac{\lambda_{12}}{\lambda_1 + \lambda_2} \exp[-\lambda \max(x,y)] \quad \text{for } x > 0,\ y > 0.$$

Here

$$\lambda = \lambda_1 + \lambda_2 + \lambda_{12}.$$

Estimates of R when the underlying distribution is BVE or
ACBVE have been obtained by Basu (1976). These results will be
communicated elsewhere.

5. Reliability of complex systems

The model described before can be extended to more complex systems.
For example, a single component system of strength Y could
be subjected to k different independent stresses $X_1, X_2, \dots, X_k$.
Here reliability of the system is given by

$$R = P\{X_1 < Y,\ X_2 < Y,\ \dots,\ X_k < Y\}$$

or

$$R = P\{\max(X_1, X_2, \dots, X_k) < Y\}.$$

An example of interest is the case where a beam with strength Y is
subjected to several stresses $X_1, X_2, \dots, X_k$. Another similar problem
of interest is to evaluate the reliability function $R^*$ of a
k-component system of strengths $Y_1, Y_2, \dots, Y_k$ respectively
each of which is subject to a common stress X. Here

$$R^* = P\{X < Y_1,\ X < Y_2,\ \dots,\ X < Y_k\} = P\{X < \min(Y_1, \dots, Y_k)\}.$$

As an example, the flow of a current X through an electronic component
assembled from several subcomponents with abilities to accommodate
currents $Y_1, Y_2, \dots, Y_k$ would follow this pattern.

Chandra (1975) has considered the problem of estimating R and
$R^*$ under the assumption that the X's and Y's are all independent
random variables and (a) all follow normal distributions, (b) Y's
are all exponential and X is normal with known variance.

Bhattacharyya and Johnson (1974) considered the problem of
estimating the reliability function R for a more complex m-out-of-k
system. Here each of k components of a system of strengths $Y_1,
Y_2, \dots, Y_k$ is subjected to a stress X and the system survives if
at least m out of the k components survive. Assuming $X, Y_1, \dots, Y_k$
to be independent with distribution functions $F(x), G_1(y_1), G_2(y_2),
\dots, G_k(y_k)$, Bhattacharyya and Johnson considered the problem of
estimating the reliability function R = P(at least m of the $Y_1, \dots,
Y_k$ exceed X), under the assumption $G_1 = G_2 = \dots = G_k = G$, say, and that F
and G are exponential distributions with known scale parameters.
Here

$$R = \sum_{j=m}^{k} \binom{k}{j} \int_0^\infty [1 - G(x)]^j\, [G(x)]^{k-j}\, dF(x).$$

Bhattacharyya and Johnson (1975) have also considered a nonparametric
approach for the above problem.
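
For exponential stress and strength the m-out-of-k reliability has a closed form that is simple to evaluate. The sketch below is a worked illustration, not the authors' procedure: it assumes X is exponential with mean $\alpha$ and each $Y_i$ is exponential with mean $\beta$, and it expands $[G(x)]^{k-j}$ by the binomial theorem so that each term integrates to $\beta/(\beta + (i+j)\alpha)$. The function name is hypothetical.

```python
from math import comb

def reliability_m_out_of_k(alpha, beta, m, k):
    """P(at least m of k exponential strengths (mean beta) exceed an
    exponential stress (mean alpha)), all variables independent."""
    total = 0.0
    for j in range(m, k + 1):          # exactly j strengths exceed the stress
        inner = sum((-1) ** i * comb(k - j, i) * beta / (beta + (i + j) * alpha)
                    for i in range(k - j + 1))
        total += comb(k, j) * inner
    return total

r_single = reliability_m_out_of_k(1.0, 1.0, 1, 1)    # one component: beta/(alpha+beta)
r_series = reliability_m_out_of_k(1.0, 1.0, 2, 2)    # both of two must survive
r_parallel = reliability_m_out_of_k(1.0, 1.0, 1, 2)  # at least one of two survives
```

With equal means the three cases give 1/2, 1/3 and 2/3, which agree with direct calculation for the minimum and maximum of two exponential strengths.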

The  author  is  currently  investigating  additional  problems  In 
this  area  .results  of  which  will  be  communicated  elsewhere. 
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ABSTRACT.  The  fracture  mechanics  studies  of  gun  tube  fatigue 
conducted  thus  far  are  essentially  deterministic.  That  is,  crack 
growth  and  failure  are  described  exactly  by  assuming  that  all  pertinent 
parameters  are  known.  Much  information  has  been  gained  by  this 
approach  in  studying  the  important  parameters  that  affect  fatigue  life. 
Fatigue  life,  however,  is  known  to  be  a random  variable.  The  proba- 
bilistic nature  of  fatigue  life  must,  therefore,  be  taken  into  accotmt 
in  the  development  of  gun  tubes. 

The  development  approach  used  at  the  present  time  is  to  schedule 
gun  barrel  replacement  early  enough  to  forestall  failure  during  firing. 
Since  fatigue  life  is  a random  variable,  this  is  accomplished  by 
statistically  determining  a "safe  life"  from  fatigue  test  results  on  a 
small  number  of  tubes. 

In this paper, a probabilistic approach starting with existing
theories of fracture mechanics is used to determine the best fit
theoretical distribution of life. The main purpose is to improve the
present statistical methods for determining safe life by providing a
basis for choosing a distribution in analyzing small sample data. The
approach used is to assume that the material properties and design
parameters in crack growth and failure laws are random variables.
Fatigue life is then given as a function of a number of random variables.
The fatigue test results for the 105mm M137A1 and 175mm M113B1 tubes
are used as bases to estimate means and variances of the model
parameters. Monte Carlo simulation studies are then conducted by assuming
various probability distributions for the model parameters and computing
the statistics of the distribution of fatigue lives. Results of the
Monte Carlo studies indicate that the best-fit theoretical distributions
of fatigue life are the 2- and 3-parameter log-normal.

1. INTRODUCTION. The general problem considered is the fatigue
failure of gun tubes resulting from repetitive firing pressure cycles.
Numerous studies have been performed at the Watervliet Arsenal and
elsewhere on fatigue crack growth and failure of gun tubes [1-12].
These studies include both theoretical fracture mechanics, which relates
material properties and design parameters to crack growth, and
experimental measurement on actual gun tubes of crack depth versus number of
cycles.

The fracture mechanics studies conducted thus far are essentially
deterministic. That is, crack growth and failure are described exactly
by assuming that all pertinent parameters are known. Empirical methods
are used to estimate some of the model parameters. Much information
has been gained by this approach in studying the important parameters
that affect fatigue life [10-12]. Fatigue life, however, is known to
be a random variable. The probabilistic nature of fatigue life must,
therefore, be taken into account in the development of gun tubes.

The development approach used at the present time is to schedule
gun barrel replacement early enough to forestall failure during firing.
Since fatigue life is a random variable, this is accomplished by
statistically determining a "safe life" from fatigue test results on a
small number of tubes [4-6,13,14]. The safe life is a statistical
tolerance limit [15] for fatigue life for which current specifications
require at least a 0.999 probability that tubes will survive the
specified safe life. This is determined by first assuming a theoretical
distribution of fatigue life and then statistically computing the 0.999
tolerance limit at 90% confidence from a six tube test. The main
drawback of this approach is the lack of justification for choosing the
theoretical distribution. In the past the 3-parameter Weibull has been
arbitrarily assumed [4-6,13].

In this paper, a probabilistic approach starting with existing
theories of fracture mechanics is used to determine the best fit
theoretical distribution of life. The main purpose is to improve the
present statistical methods for determining safe life by providing a
basis for choosing a distribution in analyzing small sample data.

The  approach  used  here  is  to  assume  that  the  material  properties 
and  design  parameters  in  crack  growth  and  failure  laws  are  random 
variables.  Fatigue  life  is  then  given  as  a function  of  a number  of 
random  variables.  The  fatigue  test  results  for  the  105mm  M137A1  and 
175mm  M113B1  tubes  [4,5]  are  used  as  bases  to  estimate  means  and 
variances  of  the  model  parameters.  Monte  Carlo  simulation  studies  are 
then  conducted  by  assuming  various  probability  distributions  for  the 
model  parameters  and  computing  the  statistics  of  the  distribution  of 
fatigue  lives  [16,  p.  124]. 


2. PROBABILISTIC MODEL BASED ON FRACTURE MECHANICS. There are
essentially three phases in the fatigue failure of gun tubes: 1)
initiation of cracks; 2) stable crack growth; and 3) failure through unstable
crack growth or perforation of the tube surface. Initiation of cracks
occurs very early in the life of a tube due primarily to the heat
effects of firing the first few rounds [5,10]. The main phenomena in
tube fatigue, therefore, are crack growth and failure.
The theories of fracture mechanics for fatigue of tubes are well
covered in the literature and Army reports, so only the final results
are summarized here (see [11] and references listed in this paper).
The crack growth model used in this study is based on the Paris [17]
expression for rate of crack growth and on analyses and experimental
results of Throop [12], Throop and Miller [11], and others [1-10].

The rate of crack growth is approximated by the expression

\[ \frac{db}{dN} = \frac{(\Delta K)^m}{M} \tag{1} \]

in which b = crack depth

N = number of cycles

$\Delta K$ = range of variation of stress intensity factor K
for one cycle (see [18] for discussion of stress
intensity factor)

m = empirical parameter dependent on material and
stress intensity

M = empirical parameter dependent on material
properties.

In the Throop model [12], a value of m equal to 3.0 gives an
adequate overall fit to tube fatigue data although m is known to vary
from specimen to specimen and for different tube designs. The
variables $\Delta K$ and M in this model are given as

\[ \Delta K = \alpha S \sqrt{\pi b} \tag{2} \]

\[ M = E K_{Ic} \sigma_y / C \tag{3} \]

in which S = maximum hoop stress at the bore of the tube,
$= P(w^2+1)/(w^2-1)$; P = internal pressure, w = O.D./I.D.

$\alpha$ = empirical parameter which depends on crack shape
and residual stresses. Compressive residual
stresses at the bore of the tube are introduced
using the autofrettage process [19,20].

E = Young's modulus

$K_{Ic}$ = fracture toughness for a crack in a tangential
stress field. $K_{Ic}$ is the value of stress intensity
K at which unstable crack growth begins.

$\sigma_y$ = yield strength

C = empirical parameter which varies with m to maintain
dimensional homogeneity and may be a function of
other material properties.


Substituting (2) and (3) into (1) gives

\[ \frac{db}{dN} = \frac{C (\alpha S \sqrt{\pi b})^m}{E \sigma_y K_{Ic}} \tag{4} \]

In the probability model, the exponent m is allowed to be a random
variable with the mean being determined empirically. The variables
E, $\sigma_y$, $K_{Ic}$, $\alpha$ and S are also random variables.

All of the parameters in (4) can statistically vary from cycle to
cycle, as a function of crack depth, and for different cracks within
a given tube. Depth measurements of the largest crack versus number
of cycles as well as results of probabilistic studies indicate, however,
that the greatest sources of fatigue life variability stem from
tube-to-tube variability in the controlling crack growth parameters. Fatigue
crack growth in a given tube, therefore, is essentially deterministic
in comparison to tube-to-tube variability. The problem then reduces
to integrating (4) assuming that material and tube parameters remain
constant within a given tube:

\[
N_f = N - N_i =
\begin{cases}
\dfrac{2 E \sigma_y K_{Ic}}{C (\alpha S \sqrt{\pi})^m (m-2)}
\left( b_i^{-\frac{1}{2}(m-2)} - b^{-\frac{1}{2}(m-2)} \right) & \text{for } m \neq 2 \\[2ex]
\dfrac{E \sigma_y K_{Ic}}{C (\alpha S)^2 \pi} \ln(b/b_i) & \text{for } m = 2
\end{cases}
\tag{5}
\]

in which $b_i$ = initial crack depth which depends on the heat
affected zone and residual stresses.

$N_i$ = initial number of cycles yielding $b_i$.

In (5), $N_i$ is relatively small and can be assumed zero. The initial
crack depth $b_i$ is assumed to be a random variable.

Failure  occurs  when  the  crack  depth  b is  either  equal  to  the  tube 
wall  thickness  B or  equal  to  the  critical  depth  at  which  unstable  growth 
begins.  Unstable  crack  growth  in  tubes  occurs  when 

A * 

be  - “ C-~3  (63 

in which $b_c$ = critical crack depth

A = empirical constant which accounts for differences
in crack shape in the tube and in the specimens
used to determine $K_{Ic}$.

Finally, fatigue life $N_f$ is equal to $(N - N_i)$ in (5) where b = min(B, $b_c$).
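
The fatigue life relation (5) can be sketched in a few lines of code. The function names `hoop_stress` and `fatigue_life` are hypothetical, $N_i$ is taken as zero as stated above, and the final crack depth b is passed in directly (for example the wall thickness B) rather than computed from (6). The input values in the usage lines are the 175mm means quoted in Table IV and serve only as an illustration.

```python
import math


def hoop_stress(pressure, outside_diam, inside_diam):
    """Maximum hoop stress at the bore: S = P (w^2 + 1) / (w^2 - 1), w = O.D./I.D."""
    w = outside_diam / inside_diam
    return pressure * (w * w + 1.0) / (w * w - 1.0)


def fatigue_life(E, sigma_y, K_Ic, C, alpha, S, m, b_i, b):
    """Cycles to grow a crack from depth b_i to depth b, equation (5) with N_i = 0."""
    strength = E * sigma_y * K_Ic
    if abs(m - 2.0) < 1e-9:
        # Special case m = 2: logarithmic form of the integral.
        return strength / (C * (alpha * S) ** 2 * math.pi) * math.log(b / b_i)
    half = 0.5 * (m - 2.0)
    front = 2.0 * strength / (C * (alpha * S * math.sqrt(math.pi)) ** m * (m - 2.0))
    return front * (b_i ** (-half) - b ** (-half))


# Illustrative use with the 175mm mean values; b is set to the wall thickness.
S_175 = hoop_stress(46.0, 15.0, 7.04)
wall = 0.5 * (15.0 - 7.04)
life_175 = fatigue_life(E=30000.0, sigma_y=146.0, K_Ic=135.5, C=0.2413,
                        alpha=0.8495, S=S_175, m=3.0, b_i=0.06, b=wall)
```

Because the term in $b^{-\frac{1}{2}(m-2)}$ is small when b is much larger than $b_i$, the computed life is only weakly sensitive to the exact choice of final crack depth in this sketch.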
3. LEVELS OF VARIABILITY OF MATERIAL PROPERTIES AND DESIGN
PARAMETERS. Equation (5) is a model of fatigue life given as a
function of random material and design parameters. The theoretical
forms of the distributions of the different model parameters are
unknown. The normal, lognormal and Weibull distributions [21] were
consequently assumed for the model parameters in studying the form
of the distribution of $N_f$. For these distributions, the mean and
variance of each parameter are sufficient to fully characterize the
random variables.

Available test data for the 105mm M137A1 and 175mm M113B1 tubes
were used as bases to estimate means and variances of the model
parameters. Once the model parameters are characterized in a probabilistic
sense, sensitivity studies can be performed to determine important
factors that influence the statistics of $N_f$.

a. 105mm M137A1 Tube Data. Table I lists fatigue life and
property data for nine 105mm tubes [4]. The fracture toughness was
not measured for these tubes and had to be estimated from the yield
strength and critical crack depth data using (6) and an empirical
relationship for $\sigma_y$ versus $K_{Ic}$ [22]. In addition to these data,
crack depth versus number of cycles data were measured on these tubes.
The model parameters m, $\alpha$, and $b_i$ were estimated from these data by
fitting the model (5) to the data. Figure 1 shows a comparison of the
model to the data for some of the tubes.


TABLE I: FATIGUE AND PROPERTY DATA FOR 105MM M137A1 TUBES

Tube     Fatigue Life,         $b_c$,   $\sigma_y$,   $K_{Ic}$ (1),   $\alpha$ (2)
No.      Rounds Plus Cycles    in       ksi           ksi√in

59421    16798                 0.80     196            90             .777
55071    12576                 0.80     190            99             .851
58046    12469                 1.07     171           116             .864
59906    12162                 0.60     189            85             .841
62103    10971                 0.85     192           107             .891
59895    10801                 0.80     187           104             .892
59527    10397                 1.05     204           121             .910
59239     9503                 0.70     187           100             .921
59531     8882                 0.75     207           106             .944

(1) Estimates using equation (6) and $\sigma_y = 334 - 1.39 K_{Ic}$ [22].

(2) Estimates from crack depth vs. cycles data.
FIGURE 1: CRACK DEPTH VS ROUNDS PLUS CYCLES, 42 KSI, COMPARISON OF MODEL WITH MEASURED DATA


Table II is a summary of the means and standard deviations of the
model parameters either estimated from the 105mm tube data or assumed
if no data were available.

TABLE II: SUMMARY OF MEANS AND STANDARD DEVIATIONS

Parameter

$D_o$, Outside diam., in
$D_i$, Inside diam., in
P, Max. pressure, ksi
$\alpha$, Crack shape-residual stress parameter
$K_{Ic}$, Fracture toughness, ksi√in
$\sigma_y$, Yield strength, ksi ($\sigma_y = 334 - 1.39 K_{Ic}$)
$b_i$, Initial crack depth, in
m, Rate exponent
E, Young's modulus, ksi
A, Critical crack depth constant
C, Empirical constant

(1) E = Estimated; A = Assumed

b. 175mm M113B1 Tube Data. Table III summarizes the fatigue and
property data either measured or estimated from tests on four 175mm
tubes [5]. Figure 2 is a comparison of the model to the crack depth
versus cycles data for these tubes. The means and standard deviations
estimated from data or assumed for the model parameters are summarized
in Table IV.

TABLE III: FATIGUE AND PROPERTY DATA FOR 175MM M113B1 TUBES


(1) $K_Q$ was adjusted to account for $b_c$ = 2.40 for tube 4133 by
applying equation (6). $K_Q$ is an estimate of $K_{Ic}$ using a
nonstandard specimen.

(2) Estimates from crack depth versus cycles data.
TABLE IV: SUMMARY OF MEANS AND STANDARD DEVIATIONS OF
MODEL PARAMETERS FOR 175MM TUBES

Parameter                       Standard
(See Table II)     Mean         Deviation                            (1)

$D_o$              15.0         0.0                                  A
$D_i$              7.04         0.0                                  A
P                  46           0.0 (2)                              A
$\alpha$           .8495        0.045                                E
$K_{Ic}$           135.5        15.3                                 E
$\sigma_y$ (3)     146          $\sigma_y = 334 - 1.39 K_{Ic}$
$b_i$              0.06         0.005                                A
m                  3.0          0.1                                  A
E                  30000        300                                  A
A                  2.26         0.0                                  A
C                  0.2413       0.0                                  A

(1) E = Estimated; A = Assumed

(2) Tube-to-tube variation assumed zero; however,
cycle-to-cycle standard deviation = 0.90 from
[23].

(3) $\sigma_y$ was computed from the equation given. This
resulted in a somewhat lower value than the measured
values given in Table III. The computed $\sigma_y$ is still
within the required specifications of UO-160 ksi.
4. BEST FIT PROBABILITY DISTRIBUTION OF FATIGUE LIVES. In this
section, the model expressed by equation (5) is used to generate
probability distributional information for fatigue lives of tubes. This
is accomplished by first assuming probability distributions for the
model parameters and then using Monte Carlo simulation to generate
the fatigue life distribution. The simulation trials were conducted
as follows:


a. The general form of the distribution for the model parameters
is fixed. A choice of one of three possible distributions is used:
normal, lognormal or Weibull.

b. The mean and standard deviation for each parameter are fixed
using the test results and assumptions given in Section 3 as bases.
It should be noted that the 105mm and 175mm tube data are used only
to provide a starting point for conducting the Monte Carlo trials.

c. A value for each of the random model parameters is generated
using random numbers [16, p. 124].

d. The fatigue life for the given set of parameters is computed
using (5) and (6).

e. Steps c) and d) are repeated J times (usually 1,000 to 10,000)
yielding J different values of fatigue failure times.

f. Various distributional statistics are computed from the J
failure times: e.g., mean, variance, coefficients of skewness and
kurtosis [16, p. 146], 99.0 and 99.9 lower percentiles, and the K-S
(Kolmogorov-Smirnov) statistic [16, p. 466].

Steps  a)  through  f)  can  be  repeated  for  different  model  parameter 
distributions,  different  values  of  parameter  means  and  standard 
deviations,  different  failure  criteria,  etc. 
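
Steps a) through f) can be sketched as follows for the case of normally distributed parameters. This is a minimal illustration, not the program used in the study: the function names are hypothetical, the means and standard deviations are those quoted for the 175mm tubes in Table IV, $\sigma_y$ is computed from $K_{Ic}$ with the relation $\sigma_y = 334 - 1.39 K_{Ic}$, and, as a simplifying assumption, failure is taken to occur by perforation of the wall only, so the critical depth of equation (6) is not used. A "99.9 lower percentile" is read here as the life exceeded by 99.9 percent of the simulated tubes.

```python
import math
import random

OD, ID, PRESSURE = 15.0, 7.04, 46.0   # in, in, ksi (standard deviations of zero)
C_CONST = 0.2413                      # empirical constant C (standard deviation zero)


def fatigue_life(E, sigma_y, K_Ic, C, alpha, S, m, b_i, b):
    """Equation (5) with N_i = 0, for m != 2."""
    half = 0.5 * (m - 2.0)
    front = (2.0 * E * sigma_y * K_Ic
             / (C * (alpha * S * math.sqrt(math.pi)) ** m * (m - 2.0)))
    return front * (b_i ** (-half) - b ** (-half))


def simulate_lives(trials, seed=0):
    """Steps c) to e): draw the random parameters and compute J fatigue lives."""
    rng = random.Random(seed)
    w = OD / ID
    S = PRESSURE * (w * w + 1.0) / (w * w - 1.0)
    wall = 0.5 * (OD - ID)            # assumed failure depth: wall thickness B
    lives = []
    for _ in range(trials):
        K_Ic = rng.gauss(135.5, 15.3)
        lives.append(fatigue_life(
            E=rng.gauss(30000.0, 300.0),
            sigma_y=334.0 - 1.39 * K_Ic,
            K_Ic=K_Ic,
            C=C_CONST,
            alpha=rng.gauss(0.8495, 0.045),
            S=S,
            m=rng.gauss(3.0, 0.1),
            b_i=rng.gauss(0.06, 0.005),
            b=wall))
    return lives


def summarize(lives):
    """Step f): moments and lower percentiles of the simulated lives."""
    n = len(lives)
    mean = sum(lives) / n
    var = sum((x - mean) ** 2 for x in lives) / n
    sd = math.sqrt(var)
    ordered = sorted(lives)
    return {
        "mean": mean,
        "variance": var,
        "skewness": sum((x - mean) ** 3 for x in lives) / (n * sd ** 3),
        "kurtosis": sum((x - mean) ** 4 for x in lives) / (n * sd ** 4),
        "p99.0": ordered[int(0.010 * n)],   # exceeded by 99.0% of the lives
        "p99.9": ordered[int(0.001 * n)],   # exceeded by 99.9% of the lives
    }


lives = simulate_lives(2000)
stats = summarize(lives)
```

Other parameter distributions can be tried by replacing the `rng.gauss` calls, for instance with `rng.lognormvariate` or `rng.weibullvariate` after converting each mean and standard deviation to the parameters those generators expect.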

A number of candidate theoretical distributions were considered
for fatigue life: normal, 2- and 3-parameter lognormal, 2- and
3-parameter Weibull and gamma [16,21]. A comparison was made of the various
theoretical distributions to the Monte Carlo model distribution. This
was done by first fitting the theoretical distribution to the model
distribution by equating means and variances. The third parameter in
the 3-parameter distributions was fixed by equating the 99.9 lower
percentile of the theoretical and model distributions. The reason for
this was to match as closely as possible the lower tails of the
distributions for comparative purposes. Goodness of fit was then
checked using the K-S statistic and by comparing the coefficients
of skewness and kurtosis (third and fourth moments) and the 99.0 and
99.9 lower percentiles.

The K-S statistic is a measure of the maximum deviation of a
theoretical cumulative distribution from a set of data: the lower the
K-S statistic, the better the fit. The data in this case are the Monte
Carlo failure times. Table V lists the K-S statistics for the various
theoretical distributions as a function of parameter distribution and
data bases.
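
The fitting and checking procedure described above can be illustrated with a short sketch. It fits a 2-parameter lognormal and a normal distribution to a sample by equating means and variances and then computes the K-S statistic for each. The function names are hypothetical, and the sample is drawn from an assumed lognormal purely for demonstration; it is not the Monte Carlo output of the study.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_mean_var(data):
    n = len(data)
    mean = sum(data) / n
    return mean, sum((x - mean) ** 2 for x in data) / (n - 1)


def lognormal_cdf_by_moments(data):
    """2-parameter lognormal whose mean and variance equal those of the data."""
    mean, var = sample_mean_var(data)
    s2 = math.log(1.0 + var / mean ** 2)      # variance of ln(life)
    mu = math.log(mean) - 0.5 * s2            # mean of ln(life)
    return lambda x: normal_cdf((math.log(x) - mu) / math.sqrt(s2))


def normal_cdf_by_moments(data):
    mean, var = sample_mean_var(data)
    return lambda x: normal_cdf((x - mean) / math.sqrt(var))


def ks_statistic(data, cdf):
    """Maximum deviation between the empirical and the theoretical CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))


rng = random.Random(1)
sample = [rng.lognormvariate(9.0, 0.5) for _ in range(2000)]  # assumed demo data
ks_lognormal = ks_statistic(sample, lognormal_cdf_by_moments(sample))
ks_normal = ks_statistic(sample, normal_cdf_by_moments(sample))
```

For a skewed sample such as this one, the moment-matched lognormal would be expected to give the smaller K-S statistic of the two.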

TABLE V: K-S STATISTIC FOR COMPARING MODEL WITH
THEORETICAL DISTRIBUTIONS

                                      K-S Statistic*

                   105mm M137A1 Tubes              175mm M113B1 Tubes
Failure Time       Parameter Distribution          Parameter Distribution
Distribution       Normal   Lognormal   Weibull    Normal   Lognormal   Weibull

Normal             .068     .061        .120       .050     .040        .109
2-p Weibull        .084     .078        .135       .082     .073        .136
3-p Weibull        .081     .073        .299       .143     .138        .330
2-p Lognormal      .029     .022        .075       .019     .010        .073
3-p Lognormal      .021     .023        .046       .014     .010        .036
Gamma              .041     .034        .090       .029     .019        .085

*Only 1,000 Monte Carlo trials were used in this case to reduce excessive
computer time.
It should be noted that the distributions of the material and design
parameters in equation (5) are not known. Different distributions
were consequently assumed to indicate the importance of this factor,
if any, on conclusions made about the failure time distribution. The
K-S statistics given in Table V indicate that the 2- and 3-parameter
lognormal provide the best overall fit to the model for the different
parameter distributions considered.

An explanation is required for why the K-S statistic in Table V
increased in some cases for the 3-parameter distribution in comparison
to the 2-parameter distribution. Generally, one would expect a better
fit when the number of distribution parameters is increased. This
would be true if the third parameter were chosen to minimize the K-S
statistic. However, in the gun fatigue problem the main concern is
estimating probabilities at the lower tails of the distributions.
The third distribution parameter was consequently chosen by equating
a given lower percentile. This resulted in a worse fit at the upper
tail for some of the cases considered, particularly for the 3-parameter
Weibull distribution, resulting in a higher K-S statistic.

In light of the above discussion, it is of interest to compare other
goodness-of-fit statistics which would indicate behavior at the lower
tails. Table VI lists the coefficients of skewness and kurtosis and
the 99.0 and 99.9 lower percentiles for the model and theoretical
distributions. The parameter distributions were assumed normal for
these particular results with 10,000 Monte Carlo trials run for each
case. Again, the lognormal, particularly the 3-parameter lognormal,
yielded the best overall fit to the model statistics. Compare, for
example, the 99.0 percentiles of the assumed failure time distributions
to the model value.

TABLE VI: COMPARISON OF SIMULATED MODEL DISTRIBUTION
WITH THEORETICAL DISTRIBUTIONS

                   Coefficients of                     Lower Percentile
Failure            Skewness         Kurtosis           99.0              99.9*
Time Dist.         105mm   175mm    105mm   175mm      105mm   175mm     105mm   175mm
                   Tubes   Tubes    Tubes   Tubes      Tubes   Tubes     Tubes   Tubes

Normal              0.0     0.0     3.00    3.00       5589    7891      3598    5977
2-p Weibull        -0.27   -0.41    2.00    3.11       5171    7181      3298    5007
3-p Weibull         0.37    0.26    2.87    2.78       6688    8707      6050    7954
2-p Lognormal       0.68    0.55    3.84    3.55       6802    8857      5745    7712
3-p Lognormal       0.01    0.65    4.18    3.75       6992    9003      6050    7954
Gamma               0.45    0.37    3.30    3.20       6456    8571      5218    7256
Model               0.86    0.76    4.45    3.07       6996    9154      6050    7954

*The third parameter for the 3-p distributions was chosen such that the
99.9 percentile was equal to the model results.
There is theoretical justification for why the lognormal could be
expected to provide a representation of the fatigue life distribution.
The model (5) gives fatigue life as a product of random variables. The
limiting distribution for the product of an infinite number of random
variables is the lognormal regardless of the form of the distribution
of the individual random variables [16, p. 262], since the logarithm of
the product is a sum of random variables to which the central limit
theorem applies. In practice the
actual number of random variables required to give a lognormal depends
on a number of factors including the form of the distribution of the
individual random variables as well as accuracy required for the
distribution which is to represent the product. For example, if each
random variable in the product is itself lognormal then the product
is always lognormal regardless of the number of random variables. It
appears that even though equation (5) represents the product of at
most seven random variables, this is apparently enough to give a trend
toward lognormal as indicated by the results.
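
The tendency of a product of random variables toward the lognormal can be illustrated numerically. The sketch below multiplies seven independent uniform variables, a choice made only for this demonstration and unrelated to the distributions assumed for the gun tube parameters, and compares the skewness of the product with the skewness of its logarithm. If the product is close to lognormal, its logarithm should be close to normal and hence nearly symmetric.

```python
import math
import random


def skewness(values):
    """Coefficient of skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)


rng = random.Random(0)
# Each trial is the product of seven independent positive random variables.
products = [math.prod(rng.uniform(0.5, 1.5) for _ in range(7))
            for _ in range(20000)]
logs = [math.log(p) for p in products]

skew_product = skewness(products)   # strongly right-skewed
skew_log = skewness(logs)           # much closer to zero
```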


5. FUTURE RESEARCH EFFORTS. The results reported in this paper
were based on the particular fracture mechanics model given by equation
(5). As additional experimental results are obtained this model may
be revised as well as the values of the model parameters and their
variances. The effect on life distribution must be rechecked in this
instance.

In any case, a number of interesting studies may be pursued using
the developed probabilistic model:

a. determine the relative effects of variability in design and
material parameters on the variability of fatigue life;

b. study possible methods of increasing safe life through control
of statistical parameters;

c. study different methods of computing safe life; and

d. improve the initial design approach for new gun tubes.
ESTIMATION AND EFFECT OF NOISE CORRELATION
ON VARIANCE ESTIMATION FROM MOVING ARC SMOOTHING

Paul H. Thrasher
Quality Assurance Office
US Army White Sands Missile Range
White Sands Missile Range, New Mexico


ABSTRACT. Correlation in the noise on Y, in measurements of Y versus X
with X assumed exact, does not formally affect the moving arc least-squares
estimate of Y. It does, however, affect the variance estimate of Y. Analysis
has been done to find correction factors to the zero correlation estimates of
(1) the moving arc smoothing factor and (2) the degrees of freedom in the
relation

    [Variance Estimate] = [Smoothing Span Factor] / [Degrees of Freedom]

Both correction factors depend on the correlation matrix. An algorithm has
been devised to estimate the correlation matrix by assuming First Order Markov
correlation. Problems with the application of the theory are discussed and
possible modifications are suggested.


1. INTRODUCTION. In many physical measurements of related quantities X
and Y, two conditions exist. First, the independent variable X can be measured
so much more accurately than the dependent variable Y that X can be assumed
exact. Second, the man and/or machine system which measures Y introduces
correlated noise. In one example, the tracking of missiles, X is time and Y
is position.


The statistical analysis may be complicated by a lack of knowledge about
the physical model describing the data. One approach to this dilemma is to do
a least-squares fit of a polynomial to a smoothing span of N data points in
order to find a "smoothed" value for the middle point i. To analyze the (i+1)th
point, the smoothing span must be shifted one point forward in X and the
least-squares analysis must be repeated. In the example of missile tracking, a
quadratic polynomial fits a highly restricted physical situation. The quadratic
description is rendered invalid by such factors as air resistance, changing
rocket thrusts, and stage separation. Since the correct physical description
is unknown, however, the quadratic polynomial is normally used.


The theory presented below is based on a polynomial model of degree n.
Three sections are devoted to the theory.

First, an algebraic derivation yields values of (a) smoothed positions
and corresponding derivatives $d^m Y_{S,i}/dX^m$, (b) estimates of variances of
$d^m Y_{S,i}/dX^m$ when the noise correlation is not considered, and (c) correction
factors to these variance estimates in order to take correlation into account.
These correction factors are functions of the correlation coefficients and
the number of degrees of freedom in the smoothing span.

The second theory section is a matrix derivation which obtains (a) an
alternate expression for the polynomial used in the first section and
(b) the relation between the number of degrees of freedom and the correlation
matrix.

The third section estimates the correlation coefficient in the correlation
matrix by using a First Order Markov approximation.

The fourth section reports on difficulties encountered in applying the
theory to (a) the output of a white noise generator that has had First Order
Markov correlation introduced into it and (b) actual missile tracking data.
The basic problem is that the results appear to depend on analysis variables
which have no physical influence on the correlation present.
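
A test signal of the kind mentioned above, white noise with First Order Markov correlation introduced into it, can be sketched as follows. The recursion used here is an assumed construction (a first-order autoregression with correlation r between adjacent points, so that the correlation at lag L is r to the power L); the source does not state how its generator output was processed, and the function names are hypothetical.

```python
import math
import random


def markov_noise(n, r, sigma=1.0, seed=0):
    """Unit-step first-order Markov noise: x[t] = r*x[t-1] + sqrt(1-r^2)*w[t]."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigma)
    series = [x]
    for _ in range(n - 1):
        # The sqrt(1 - r^2) factor keeps the variance equal to sigma^2.
        x = r * x + math.sqrt(1.0 - r * r) * rng.gauss(0.0, sigma)
        series.append(x)
    return series


def autocorrelation(series, lag):
    """Sample correlation coefficient between points separated by `lag`."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag)) / n
    return cov / var


noise = markov_noise(20000, r=0.6)
rho_1 = autocorrelation(noise, 1)   # expected near 0.6
rho_2 = autocorrelation(noise, 2)   # expected near 0.6**2 = 0.36
```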

A brief fifth section lists the primary cause of the difficulty and
possible corrective procedures. This information was provided by the panel
at the presentation of this problem to the Twenty-Second Conference on the
Design of Experiments in Army Research, Development, and Testing.

2. ALGEBRAIC RELATION BETWEEN VARIANCE ESTIMATES FOR IGNORING AND
CONSIDERING CORRELATION.

This section discusses the effect of the covariance of measurements,

\[ \mathrm{COV}(Y_{i+j}, Y_{i+j'}) = \rho(i,j,j',\omega,s)\, \mathrm{VAR}(Y_i), \tag{2.1} \]

on the variance of a least-squares polynomial. If the data's correlation is
either non-existent or ignored, the correlation coefficient, $\rho(i,j,j',\omega,s)$,
is set equal to $\delta_{jj'}$. In general, however, the measuring device's bandwidth
and measurement interval, $\omega$ and s, result in $\rho \neq \delta_{jj'}$. The following equations
trace the influence of the data correlation through the moving arc smoothing
process.

The calculation of the smoothed dependent variable does not formally
depend on the correlation in the data. An nth degree polynomial, $Y_{S,i+j}$, is
constructed through N data points. The ith point is in the center of this
smoothing span and j ranges from $-a = -(N-1)/2$ to a to locate individual
measurements. The polynomial is a summation over orthonormal functions which
are defined by

\[ F_k(sj) = \sum_{\ell=0}^{k} C_{k\ell}\, j^{\ell}, \tag{2.2} \]

where orthonormality determines the $C_{k\ell}$; thus, the polynomial is

\[ Y_{S,i+j} = \sum_{k=0}^{n} A_k(i)\, F_k(sj). \tag{2.3} \]

A least-squares calculation minimizes

\[ \sum_{j=-a}^{a} \left( Y_{i+j} - Y_{S,i+j} \right)^2 \tag{2.4} \]

to determine the constants $A_k(i)$ to be

\[ A_k(i) = \sum_{j=-a}^{a} F_k(sj)\, Y_{i+j}. \tag{2.5} \]
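
The moving arc procedure described above, in which a polynomial is fitted by least squares to each span of N points and evaluated at the middle point, can be sketched as follows. In this illustration the orthonormal functions are built numerically by Gram-Schmidt orthogonalization of the powers of j over the span rather than from closed-form constants; that construction and the function names are assumptions made for the example.

```python
import math


def orthonormal_functions(N, n):
    """Values of orthonormal polynomials of degree 0..n on j = -a..a (N odd)."""
    a = (N - 1) // 2
    functions = []
    for k in range(n + 1):
        v = [float(j) ** k for j in range(-a, a + 1)]
        for f in functions:
            # Remove the component of j**k along each lower-degree function.
            proj = sum(x * y for x, y in zip(v, f))
            v = [x - proj * y for x, y in zip(v, f)]
        norm = math.sqrt(sum(x * x for x in v))
        functions.append([x / norm for x in v])
    return functions


def moving_arc_smooth(y, N=5, n=2):
    """Smoothed values at every point that has a full span of N neighbours."""
    a = (N - 1) // 2
    F = orthonormal_functions(N, n)
    smoothed = []
    for i in range(a, len(y) - a):
        span = y[i - a:i + a + 1]
        # Least-squares coefficients: A_k(i) = sum_j F_k(j) * Y_{i+j}.
        A = [sum(fj * yj for fj, yj in zip(f, span)) for f in F]
        # Fitted polynomial evaluated at the middle point, j = 0.
        smoothed.append(sum(Ak * f[a] for Ak, f in zip(A, F)))
    return smoothed


# A noise-free quadratic is reproduced exactly by a quadratic moving arc.
track = [2.0 + 3.0 * x - 0.5 * x * x for x in range(12)]
smooth = moving_arc_smooth(track, N=5, n=2)
```

Because the functions are orthonormal over the span, each coefficient is a single weighted sum of the data and no matrix inversion is needed when the span is shifted.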


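The mth derivative of $Y_{S,i}$ is obtained by m differentiations of $Y_{S,i+j}$ with
respect to $(X_i + sj)$ and then setting j equal to 0; this results in

\[ \frac{d^m Y_{S,i}}{dX^m} = \frac{m!}{s^m} \sum_{k=m}^{n} C_{km} A_k(i)
 = \sum_{j=-a}^{a} \left[ \frac{m!}{s^m} \sum_{k=m}^{n} \sum_{\ell=0}^{k} C_{km} C_{k\ell}\, j^{\ell} \right] Y_{i+j}. \tag{2.6} \]

Since the $d^m Y_{S,i}/dX^m$ values are functions of the $Y_{i+j}$ data through the
$A_k(i)$ values, the errors in these derivatives are also dependent on the errors
and correlation of the $Y_{i+j}$ data. The variance of $d^m Y_{S,i}/dX^m$ is calculated
from expectation relations to be

\[ \mathrm{VAR}\!\left( \frac{d^m Y_{S,i}}{dX^m} \right) = \left( \frac{m!}{s^m} \right)^2 \sum_{k=m}^{n} \sum_{k'=m}^{n} C_{km} C_{k'm}\, \mathrm{COV}[A_k(i), A_{k'}(i)]. \tag{2.7} \]

The covariance of $A_k(i)$ and $A_{k'}(i)$ is found to be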
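\[ \mathrm{COV}[A_k(i), A_{k'}(i)] = \sum_{j=-a}^{a} \sum_{j'=-a}^{a} F_k(sj)\, F_{k'}(sj')\,
E\{ [Y_{i+j} - E(Y_{i+j})] [Y_{i+j'} - E(Y_{i+j'})] \}. \tag{2.8} \]

By using $\delta_{jj'}\, \mathrm{VAR}(Y_i)$ for the expectation in (2.8), the
correlation-ignored result is found to be

\[ \mathrm{VAR}\!\left( \frac{d^m Y_{S,i}}{dX^m} \right) = \left( \frac{m!}{s^m} \right)^2 \mathrm{VAR}(Y_i) \sum_{k=m}^{n} C_{km}^2. \tag{2.9} \]

The degrees of freedom used in $\mathrm{VAR}(Y_i)$ is n+1 less than N. By using the general
expression $\rho(i,j,j',\omega,s)\, \mathrm{VAR}(Y_i)$ for the same expectation, the
correlation-considered result is found and the ratio of the correlation-considered
estimate of variance to the correlation-ignored estimate is calculated to be
the product of

\[ R_{nm}(i) = \frac{ \sum_{k=m}^{n} \sum_{k'=m}^{n} \sum_{\ell=0}^{k} \sum_{\ell'=0}^{k'} \sum_{j=-a}^{a} \sum_{j'=-a}^{a}
C_{km} C_{k'm} C_{k\ell} C_{k'\ell'}\, (j)^{\ell} (j')^{\ell'}\, \rho(i,j,j',\omega,s) }{ \sum_{k=m}^{n} C_{km}^2 }, \tag{2.10} \]

where $(j)^{\ell} = 1$ for $j = \ell = 0$ and $(j')^{\ell'} = 1$ for $j' = \ell' = 0$, and

\[ \frac{N - (n+1)}{N - T}, \tag{2.11} \]

where T is the "true" reduction in the degrees of freedom discussed in Section 3.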
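
The first of the two correction factors can be evaluated numerically. The sketch below computes the ratio of the six-fold sum over the polynomial constants, the powers of j and j', and the correlation coefficient to the sum of the squared constants $C_{km}$, which is the reading of $R_{nm}(i)$ assumed here. The constants are obtained by Gram-Schmidt orthonormalization instead of closed forms, the correlation is supplied as a hypothetical function `rho(j, jp)`, and all names are illustrative rather than taken from the source.

```python
import math


def orthonormal_coefficients(N, n):
    """C[k][l]: coefficient of j**l in the k-th orthonormal polynomial on j = -a..a."""
    a = (N - 1) // 2
    js = range(-a, a + 1)

    def values(c):
        return [sum(cl * float(j) ** l for l, cl in enumerate(c)) for j in js]

    C = []
    for k in range(n + 1):
        c = [0.0] * (n + 1)
        c[k] = 1.0
        v = values(c)
        for prev in C:
            # Subtract the projection of j**k on each lower-degree polynomial.
            proj = sum(x * y for x, y in zip(v, values(prev)))
            c = [x - proj * y for x, y in zip(c, prev)]
        norm = math.sqrt(sum(x * x for x in values(c)))
        C.append([x / norm for x in c])
    return C


def correction_factor(N, n, m, rho):
    """Ratio of correlation-considered to correlation-ignored smoothing factor."""
    a = (N - 1) // 2
    js = list(range(-a, a + 1))
    C = orthonormal_coefficients(N, n)
    F = [[sum(cl * float(j) ** l for l, cl in enumerate(C[k])) for j in js]
         for k in range(n + 1)]
    numerator = 0.0
    for k in range(m, n + 1):
        for kp in range(m, n + 1):
            double_sum = sum(F[k][p] * F[kp][q] * rho(js[p], js[q])
                             for p in range(N) for q in range(N))
            numerator += C[k][m] * C[kp][m] * double_sum
    denominator = sum(C[k][m] ** 2 for k in range(m, n + 1))
    return numerator / denominator


# Zero correlation gives a factor of one; an assumed Markov form gives more.
r_none = correction_factor(7, 2, 0, lambda j, jp: 1.0 if j == jp else 0.0)
r_markov = correction_factor(7, 2, 0, lambda j, jp: 0.5 ** abs(j - jp))
```

With total correlation and a polynomial of degree zero the factor equals N, which reflects the fact that averaging N fully correlated points does not reduce the variance.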

For segments of the data in which the correlation coefficient may be
assumed constant in i and symmetric in j and j', the $R_{nm}(i)$ may be rewritten
to expedite computer calculations. For j = j', the correlation coefficient must
be unity. This fact and the symmetry about the two diagonals in the array of
possible j and j' values are used to rewrite the numerator of $R_{nm}(i)$. The
definition and orthonormality of the $F_k(sj)$ functions simplify the sum over
the main diagonal; the result is

J<*m  [k 


n n k k"  <N-l)/2 

I V 2 I I I I 1 

«n  ]<-sin  )!.■*»  0 j»l 


n 

J 

km 


P<3t-j»WiS> 


n n k k-  (N-D/2  <j-l> 
k«m  k“»m  iSo  ifnO  jei 
,11, 


*'^1111  I I 


Where  » 1 for 


(2.12) 


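The constants in the orthonormal functions may be obtained by a bootstrap
derivation by starting with $C_{00}$. The non-zero constants for subscripts less
than or equal to 5 are

$$
\begin{aligned}
C_{00} &= \sqrt{\frac{1}{N}} , \\
C_{11} &= \sqrt{\frac{12}{N(N^2-1)}} , \\
C_{20} &= -\sqrt{\frac{5(N^2-1)}{4N(N^2-4)}} , \\
C_{22} &= \sqrt{\frac{180}{N(N^2-1)(N^2-4)}} , \\
C_{31} &= -\sqrt{\frac{7(3N^2-7)^2}{N(N^2-1)(N^2-4)(N^2-9)}} , \\
C_{33} &= \sqrt{\frac{2800}{N(N^2-1)(N^2-4)(N^2-9)}} , \\
C_{40} &= \sqrt{\frac{81(N^2-1)(N^2-9)}{64N(N^2-4)(N^2-16)}} , \\
C_{42} &= -\sqrt{\frac{225(3N^2-13)^2}{N(N^2-1)(N^2-4)(N^2-9)(N^2-16)}} , \\
C_{44} &= \sqrt{\frac{44100}{N(N^2-1)(N^2-4)(N^2-9)(N^2-16)}} , \\
C_{51} &= \sqrt{\frac{11(15N^4-230N^2+407)^2}{16N(N^2-1)(N^2-4)(N^2-9)(N^2-16)(N^2-25)}} , \\
C_{53} &= -\sqrt{\frac{53900(N^2-7)^2}{N(N^2-1)(N^2-4)(N^2-9)(N^2-16)(N^2-25)}} , \text{ and} \\
C_{55} &= \sqrt{\frac{698544}{N(N^2-1)(N^2-4)(N^2-9)(N^2-16)(N^2-25)}} .
\end{aligned}
\qquad (2.13)
$$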
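The closed forms for the constants listed above can be checked numerically. The sketch below is only an illustration: it assumes that each orthonormal function is the polynomial $T_k(j) = \sum_m C_{km} j^m$ on the symmetric range $j = -(N-1)/2, \ldots, (N-1)/2$, and it assumes the sign convention in which the leading coefficient $C_{kk}$ is positive. The function names are hypothetical and are not taken from the paper.

```python
import math

def orthonormal_constants(N):
    """Closed-form constants C[k, m] for k <= 5 (signs are an assumed convention)."""
    q = lambda r: N * N - r * r            # shorthand for N^2 - r^2
    d3 = N * q(1) * q(2) * q(3)
    d4 = d3 * q(4)
    d5 = d4 * q(5)
    return {
        (0, 0): math.sqrt(1 / N),
        (1, 1): math.sqrt(12 / (N * q(1))),
        (2, 0): -math.sqrt(5 * q(1) / (4 * N * q(2))),
        (2, 2): math.sqrt(180 / (N * q(1) * q(2))),
        (3, 1): -math.sqrt(7 * (3 * N * N - 7) ** 2 / d3),
        (3, 3): math.sqrt(2800 / d3),
        (4, 0): math.sqrt(81 * q(1) * q(3) / (64 * N * q(2) * q(4))),
        (4, 2): -math.sqrt(225 * (3 * N * N - 13) ** 2 / d4),
        (4, 4): math.sqrt(44100 / d4),
        (5, 1): math.sqrt(11 * (15 * N**4 - 230 * N * N + 407) ** 2 / (16 * d5)),
        (5, 3): -math.sqrt(53900 * (N * N - 7) ** 2 / d5),
        (5, 5): math.sqrt(698544 / d5),
    }

def T(k, j, consts):
    """Evaluate the k-th orthonormal polynomial at the integer abscissa j."""
    return sum(c * j**m for (kk, m), c in consts.items() if kk == k)

def gram_matrix(N):
    """Matrix of sums over j of T_k(j) * T_l(j); it should be the 6 by 6 identity."""
    half = (N - 1) // 2
    consts = orthonormal_constants(N)
    js = range(-half, half + 1)
    return [[sum(T(k, j, consts) * T(l, j, consts) for j in js)
             for l in range(6)] for k in range(6)]
```

For an odd span such as N = 11, `gram_matrix(11)` comes out as the identity to rounding error, which is the orthonormality property the text relies on. The zeros for k + m odd are simply absent from the dictionary.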

The systematic occurrence of zeros in the table of $C_{km}$ values may be used to
further expedite computer calculations. Since $C_{km} = 0$ unless k + m is even,
each term in the sums of Equation (2.12) is identically zero unless k + m,
k' + m, k + $\ell$, and k' + $\ell$ are all even.

3.  MATRIX DERIVATION OF NUMBER OF DEGREES OF FREEDOM.

The raw variance of data analyzed with a smoothing span of N points is
given by

$$ \mathrm{VAR} = \frac{E(E_e^T E_e)}{N - T} . \qquad (3.1) $$

The numerator, $E(E_e^T E_e)$, is the expectation value for the sum of the squares
of the differences between data values and corresponding smoothed or filtered
values. The denominator, N-T, is called the number of degrees of freedom. The
reduction in the degrees of freedom, T, is dependent on the correlation of the
data in the smoothing span. For zero correlation, T is one more than the degree
of the polynomial used for smoothing. This corresponds to the number of constants
in the polynomial. For total correlation, T is equal to N. In this case,
the variance is undefined. The following derivation yields a description of
the degrees of freedom for intermediate correlations.

The dependent variables, $Y_r$, may be arranged in N by 1 matrices. Each of
these column matrices is related to the independent variables, polynomial
coefficients, and random errors by

$$ Y = XB + E_\epsilon . \qquad (3.2) $$

The rth row of the random error column matrix, $E_\epsilon$, contains the error, $\epsilon_r$, of
the rth dependent variable, $Y_r$. The nth degree polynomial coefficients, $B_n$,
..., $B_0$, are in the n+1 by 1 column matrix B. The N by n+1 matrix X may
be considered as a composite of n+1 column matrices, $X_n$, ..., $X_0$. The
rth row of each $X_c$ contains the cth power of the independent variable $x_r$ that
corresponds to the dependent variable $Y_r$. The smoothed or filtered dependent
variables are given by the independent variables, X, and estimates of the
polynomial coefficients, $\hat{B}$, by

$$ \hat{Y} = X\hat{B} . \qquad (3.3) $$


A least-squares calculation may be used to find $\hat{B}$. The summation of squares
for the deviations,

$$ e_r = Y_r - \hat{Y}_r , \qquad (3.4) $$

is given by

$$ \sum_r e_r^2 = E_e^T E_e , \qquad (3.5) $$


where $E_e^T$ is the row matrix which is the transpose of the column matrix $E_e$
containing the $e_r$'s. Substitution of $E_e = Y - \hat{Y} = Y - X\hat{B}$, differentiation of the
sum of squares with respect to $\hat{B}_c$, and setting the result equal to zero yields

$$ 0 = [-Y^T X I_c + \hat{B}^T X^T X I_c] + [-Y^T X I_c + \hat{B}^T X^T X I_c]^T \qquad (3.6) $$

where $I_c$ is a column matrix defined in terms of Kronecker delta functions,
$\delta_{c,c'} = 0$ if $c \ne c'$ and $\delta_{c,c} = 1$, by

$$ I_c = (\delta_{c,n}, \ldots, \delta_{c,0})^T . \qquad (3.7) $$

Since each of the two terms in Equation (3.6) is a scalar (i.e., a 1 by 1 matrix)
and the second is the transpose of the first, the two terms are equal. Thus,
Equation (3.6) simplifies to

$$ 0 = 2[-Y^T X I_c + \hat{B}^T X^T X I_c] . \qquad (3.8) $$

Since Equation (3.8) holds for every c, this further simplifies to

$$ \hat{B} = (X^T X)^{-1} X^T Y , \qquad (3.9) $$

where the superscript -1 is the standard notation for inverse. This result
utilizes the raw data to estimate polynomial coefficients.

The raw variance of the data within the smoothing span is found by relating
the expectation values of $E_e^T E_e$, $E_\epsilon^T E_\epsilon$, and $E_\epsilon E_\epsilon^T$. The first is estimated by
the sum of the squares of the $e_r$'s, the second is the product of N and the desired
variance denoted by $\sigma^2$, and the third is the product of the correlation matrix,
V, and $\sigma^2$. Substitution of Equations (3.3), (3.9), and (3.2) into the definition
$E_e = Y - \hat{Y}$ yields

$$ E_e = [I - X(X^T X)^{-1} X^T] E_\epsilon \qquad (3.10) $$

where I is the standard unit matrix whose elements are defined by the Kronecker delta
and $(X^T X)^{-1}$ is the inverse of $(X^T X)$ defined such that

$$ (X^T X)^{-1}(X^T X) = (X^T X)(X^T X)^{-1} = I . $$

Equation (3.10) leads immediately to

$$ E_e^T E_e = E_\epsilon^T [I - X(X^T X)^{-1} X^T] E_\epsilon . \qquad (3.11) $$

Taking the expectation value of Equation (3.11) yields

$$ E(E_e^T E_e) = E(E_\epsilon^T E_\epsilon) - E\{E_\epsilon^T X(X^T X)^{-1} X^T E_\epsilon\} . $$

The first term on the right is just $N\sigma^2$. The last term may be simplified by
noting that the quantity in braces is a 1 by 1 matrix, replacing this simple
matrix by its trace, and using the identity

$$ \mathrm{Trace}(ABC) = \mathrm{Trace}(BCA) = \mathrm{Trace}(CAB) . $$

Further simplification is made by interchanging the order of expectation and
trace operations and finally by making the usual assumption that the measurements
in X are exact so

$$ E[f(X)\{E_\epsilon E_\epsilon^T\}] = f(X)\,E(E_\epsilon E_\epsilon^T) . $$

The result is

$$ E(E_e^T E_e) = N\sigma^2 - \mathrm{Trace}\{X(X^T X)^{-1} X^T E(E_\epsilon E_\epsilon^T)\} . \qquad (3.12) $$

The use of $E(E_\epsilon E_\epsilon^T) = V\sigma^2$ yields

$$ \sigma^2 = \frac{E(E_e^T E_e)}{N - \mathrm{Trace}\{X(X^T X)^{-1} X^T V\}} . \qquad (3.13) $$

Equation (3.13) is cumbersome because the trace is performed on an N by N matrix.
Trace algebra converts the quantity inside the braces to an n+1 by n+1 matrix.
Estimation of $E(E_e^T E_e)$ by $E_e^T E_e$ then yields an estimate of the raw variance to be

$$ \hat{\sigma}^2 = \frac{E_e^T E_e}{N - \mathrm{Trace}\{(X^T X)^{-1} X^T V X\}} . \qquad (3.14) $$


The evaluation of the effective degrees of freedom, i.e., the denominator
of Equation (3.14), is dependent on the data through V and the smoothing process
through X and N. The X matrix is given in terms of $x_j = js + x_0$ where $x_0$ is
the midpoint of the smoothing span and s is the measurement interval. The
general form is

$$
X = \begin{pmatrix}
x_{-\alpha}^n & \cdots & x_{-\alpha} & 1 \\
x_{-\alpha+1}^n & \cdots & x_{-\alpha+1} & 1 \\
\vdots & & \vdots & \vdots \\
x_{\alpha}^n & \cdots & x_{\alpha} & 1
\end{pmatrix}
\qquad (3.15)
$$

where $\alpha$ is defined by (N-1)/2. Although X depends on s and $x_0$, the degrees of
freedom do not. The independent variable's increment, s, has no effect because
it does not affect either the variance or the sum of the squares of the deviations.
The midpoint of the independent variable segment, $x_0$, has no effect under the
necessary assumption that the V matrix describes the correlation in all segments
considered. For computational ease, s and $x_0$ may be set equal to 1 and 0 for
the degrees of freedom calculation. This simplifies $x_j$ to $x_j = j$. The
calculation of the n+1 by n+1 matrix, $X^T V X$, is further simplified if V is
described with a single Markov constant $\rho$ by $V_{jj'} = \rho^{|j-j'|}$. The other n+1 by
n+1 matrix, $(X^T X)^{-1}$, may be obtained either analytically or computationally
by finding the inverse of

$$
X^T X = \begin{pmatrix}
\sum j^{2n} & \sum j^{2n-1} & \cdots & \sum j^{n} \\
\sum j^{2n-1} & \sum j^{2n-2} & \cdots & \sum j^{n-1} \\
\vdots & \vdots & & \vdots \\
\sum j^{n} & \sum j^{n-1} & \cdots & N
\end{pmatrix}
\qquad (3.16)
$$

where all summations are over the range $-\alpha \le j \le \alpha$. The summations over powers of j
may be found with either a computer or a mathematics handbook.

If one desires an explicit equation for the degrees of freedom, the
procedure of the above paragraph can be done analytically. The results for
n=0, n=1, and n=2 are, respectively,

$$ \mathrm{DOF}_0 = N - \frac{1}{N} H_0^T V H_0 , \qquad (3.17) $$

$$ \mathrm{DOF}_1 = N - \frac{1}{N} H_0^T V H_0 - \frac{1}{S_2} H_1^T V H_1 , \text{ and} \qquad (3.18) $$

$$ \mathrm{DOF}_2 = N - \frac{1}{S_2} H_1^T V H_1 - \frac{N H_2^T V H_2 - S_2 (H_2^T V H_0 + H_0^T V H_2) + S_4 H_0^T V H_0}{N S_4 - S_2^2} , \qquad (3.19) $$

where $S_p$ is the sum of $j^p$ over $-\alpha \le j \le \alpha$ and $H_j$ is given by

$$ H_j = \left( (-\alpha)^j , (-\alpha+1)^j , \ldots , (\alpha)^j \right)^T \qquad (3.20) $$

with the understanding that $[0]^0 = 1$. For each of these three equations,
substitution of $V = H_0 H_0^T$ yields zero. This simply states that totally
correlated data has zero degrees of freedom.

If the identity matrix is used for V in Equations (3.17), (3.18), and
(3.19), the results are $\mathrm{DOF}_0 = N-1$, $\mathrm{DOF}_1 = N-2$, and $\mathrm{DOF}_2 = N-3$. This checks
with $\mathrm{DOF}_n = N-(n+1)$, i.e., the number of degrees of freedom equals the number
of points in the smoothing span minus the number of constants in the polynomial.
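The two limiting checks above (identity V and totally correlated V) are easy to reproduce numerically. The following is a minimal sketch, not the authors' program: it assumes the denominator of Equation (3.14) in the form N - Trace{(X'X)^-1 X'VX}, the simplification $x_j = j$, and a First Order Markov correlation matrix with elements $\rho^{|j-j'|}$. Exact rational arithmetic is used so that the checks hold without rounding; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Return Z with a @ Z = b, by Gauss-Jordan elimination on exact Fractions."""
    m = len(a)
    aug = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        head = aug[col][col]
        aug[col] = [v / head for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]

def effective_dof(N, n, rho):
    """N - Trace{(X'X)^-1 X'VX} for odd span N, polynomial degree n, Markov constant rho."""
    half = (N - 1) // 2
    js = range(-half, half + 1)
    rho = Fraction(rho)
    X = [[Fraction(j) ** c for c in range(n, -1, -1)] for j in js]  # columns j^n ... j^0
    V = [[rho ** abs(j - jp) for jp in js] for j in js]             # rho^|j-j'|
    Xt = [list(col) for col in zip(*X)]
    Z = solve(matmul(Xt, X), matmul(matmul(Xt, V), X))
    return N - sum(Z[c][c] for c in range(n + 1))
```

With `rho = 0` the matrix V is the identity and the result is N-(n+1); with `rho = 1` every element of V is one and the result is zero. Intermediate values of rho fall between these limits, for example `effective_dof(7, 0, Fraction(1, 2))`.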

4.  ESTIMATE OF CORRELATION COEFFICIENTS.

The covariance of two raw data points, $Y_{i+j}$ and $Y_{i+j'}$, is related to
their correlation coefficient and the variance of the points in the range
$i-\alpha \le (j \text{ or } j') \le i+\alpha$ by

$$ \mathrm{COV}(Y_{i+j}, Y_{i+j'}) = \rho_{j,j'}\,\mathrm{VAR}(Y_i) . \qquad (4.1) $$

The pseudo-deviations are defined by

$$ e_{i+j} = Y_{i+j} - \hat{Y}_{i+j} , $$

where $\hat{Y}_{i+j}$ is not the true mean which would yield the true deviations; instead,
it is the kth degree polynomially smoothed value from the operation

$$ \hat{Y}_{i+j} = \sum_{p=-\alpha}^{\alpha} a_p Y_{i+j+p} . $$

The $a_p$'s are restricted by

$$ \sum_{p=-\alpha}^{\alpha} a_p = 1 , $$

and are defined by Equation (2.6) with j = p and m = 0.

By using two fast Fourier transforms and associated manipulations,
$\mathrm{COV}(e_{i+j}, e_{i+j'})$ may be obtained. The needed quantities, however, are either
$\mathrm{COV}(Y_{i+j}, Y_{i+j'})$ or $\rho_{j,j'}$ and $\mathrm{VAR}(Y_i)$. Unfortunately, these cannot be
obtained without applying constraints. Presented below is a method of
determining one $\rho$ and $\mathrm{VAR}(Y_i)$ assuming that $Y_i$ is a kth degree polynomial

with additive First Order Markov error.

By defining $b_p = \delta_{p,0} - a_p$, pseudo-deviations may be found from

$$ e_{i+j} = \sum_{p=-\alpha}^{\alpha} b_p Y_{i+j+p} . $$

By using

$$ \sum_{p=-\alpha}^{\alpha} b_p = 0 , $$

the expectation value of $e_{i+j} e_{i+j'}$ may be shown to equal both $\mathrm{COV}(e_{i+j}, e_{i+j'})$ and

$$ \sum_{p=-\alpha}^{\alpha} \sum_{q=-\alpha}^{\alpha} b_p b_q\,\mathrm{COV}(Y_{i+j+p}, Y_{i+j'+q}) . $$

These results and the use of Equation (4.1) lead to an expression,

$$ \mathrm{COV}(e_{i+j}, e_{i+j'}) = \mathrm{VAR}(Y_i) \sum_{p=-\alpha}^{\alpha} \sum_{q=-\alpha}^{\alpha} b_p b_q\,\rho_{j+p,\,j'+q} , \qquad (4.5) $$

which relates the known pseudo-deviation covariances to the desired raw data
correlation factors. This equation cannot be solved for $\rho_{j,j'}$, however,
because the double summation is over $(2\alpha+1)^2$ terms. In order to circumvent
this problem of having more unknowns than equations, it is convenient to
mathematically model the correlation factor.

The First Order Markov error in the i+1 point, $\epsilon_{i+1}$, is given in terms
of a single Markov constant, $\rho$, the error of the i point, $\epsilon_i$, and a random
variable, $r_{i+1}$, by

$$ \epsilon_{i+1} = \rho\,\epsilon_i + r_{i+1} . $$

Relating expectation values of $\epsilon_i \epsilon_{i+j}$ for all values of j may be used to
express the correlation coefficient $\rho_{j,j'}$ as

$$ \rho_{j,j'} = \rho^{|j'-j|} . \qquad (4.7) $$


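By using Equation (4.7) and defining an index $\phi = j' - j$, Equation (4.5)
becomes

$$ \mathrm{COV}(e_{i+j}, e_{i+j+\phi}) = \mathrm{VAR}(Y_i) \sum_{p=-\alpha}^{\alpha} \sum_{q=-\alpha}^{\alpha} b_p b_q\,\rho^{|\phi+q-p|} . $$

This set of equations has only two unknowns, $\rho$ and $\mathrm{VAR}(Y_i)$. The straightforward
approach would be to define a deviation by

$$ \Delta \equiv \mathrm{COV}(e_{i+j}, e_{i+j+\phi}) - \mathrm{VAR}(Y_i) \sum_{p=-\alpha}^{\alpha} \sum_{q=-\alpha}^{\alpha} b_p b_q\,\rho^{|\phi+q-p|} , $$

calculate a sum of squares by

$$ S \equiv \sum_{\phi=-2\alpha}^{2\alpha} \Delta^2 , $$

and find the values of $\hat{\rho}$ and $\widehat{\mathrm{VAR}}(Y_i)$ that simultaneously satisfy

$$ 0 = \left. \frac{\partial S}{\partial \rho} \right|_{\rho=\hat{\rho},\ \mathrm{VAR}(Y_i)=\widehat{\mathrm{VAR}}(Y_i)} $$

and

$$ 0 = \left. \frac{\partial S}{\partial \mathrm{VAR}(Y_i)} \right|_{\rho=\hat{\rho},\ \mathrm{VAR}(Y_i)=\widehat{\mathrm{VAR}}(Y_i)} . \qquad (4.12) $$

Unfortunately, the direct procedure is algebraically intractable. An
alternate approach is to first perform a calculation of Equation (4.12) and
find the $\widehat{\mathrm{VAR}}(Y_i)$ as a function of $\rho$ to be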


■ 4fl2c 


(4.13)

and second, define a new deviation as a function of $\rho$ only as

$$ \Delta' \equiv \mathrm{COV}(e_{i+j}, e_{i+j+\phi}) - \widehat{\mathrm{VAR}}(Y_i) \sum_{p=-\alpha}^{\alpha} \sum_{q=-\alpha}^{\alpha} b_p b_q\,\rho^{|\phi+q-p|} \qquad (4.14) $$

and graphically find $\hat{\rho}$ as the value of $\rho$ which minimizes

$$ S' \equiv \sum_{\phi} \Delta'^2 \qquad (4.15) $$

where $\phi$ is still bounded by $-2\alpha \le \phi \le 2\alpha$. In this graphical procedure S' is
of course a function of $\rho$. The saving restriction which makes the procedure
tractable is that $\rho$ is bounded by $-1 \le \rho \le +1$. The computation work is still
considerable, however, so it is worthwhile to use the invariance under change
in sign of $\phi$ of $\mathrm{COV}(e_{i+j}, e_{i+j+\phi})$ and of the double sum over p and q.


5.  NUMERICAL RESULTS.  Two sets of numbers have been analyzed in order to
determine the usefulness of the theory in the last three sections. The first
set has been generated by using a random noise generator and introducing First
Order Markov correlation of known $\rho$. The second set is from a missile versus
drone test at White Sands Missile Range.
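A test set of the first kind can be sketched as follows. This is a minimal illustration, not the generator used for the tables: it assumes Gaussian noise, a fixed seed, and innovations scaled by $\sqrt{1-\rho^2}$ so that the variance of the series stays constant (the paper does not state how its noise was scaled). The function names are hypothetical.

```python
import random

def markov_noise(n, rho, sd=1.0, seed=0):
    """Generate n errors with e[i+1] = rho * e[i] + r[i+1] and constant variance sd**2."""
    rng = random.Random(seed)
    innovation_sd = sd * (1.0 - rho * rho) ** 0.5   # assumed scaling, keeps VAR fixed
    errors = [rng.gauss(0.0, sd)]
    for _ in range(n - 1):
        errors.append(rho * errors[-1] + rng.gauss(0.0, innovation_sd))
    return errors

def lag_correlation(errors, lag=1):
    """Sample correlation coefficient between points separated by `lag` samples."""
    mean = sum(errors) / len(errors)
    num = sum((a - mean) * (b - mean) for a, b in zip(errors, errors[lag:]))
    den = sum((a - mean) ** 2 for a in errors)
    return num / den
```

For a long series the lag-1 sample correlation should be close to the input $\rho$ and the lag-2 value close to $\rho^2$, which is the $\rho^{|j'-j|}$ behaviour assumed for the correlation factor.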


The generated numbers do not lead to completely desirable results from the
analysis. Table I shows the input and one set of output of the computerized
equations from the last three sections. For large values of $\rho$, the two resulting
variance estimates agree with each other but diverge considerably from the input
variance. The basic discrepancy occurs in the output $\rho$.

Comparison of the left and right columns of Table I shows deviations for
all values of $\rho$. Table II shows sample output of $\rho$'s for ranges of smoothing
span N and polynomial degree n. Since the output average is 0.67 ± 0.07 when
the input is 0.5, and 0.17 ± 0.05 when the input is 0.2, it appears that the
problem is in the variability of the output.

Analyzing data from missile versus drone missions displays more variability
of the output. Table III shows the results of varying smoothing span and/or
polynomial degree on missile position data. The resulting output $\rho$ varies in
an unsystematic manner. A further lack of uniformity is shown in Table IV. The
drone, which the missile of Table III was attacking, was airborne for sufficient
time to analyze eight successive segments of 256 data points. The variation in
output $\rho$ between segments is evident; but again there is no evident system of
variation. A final illustration of the non-uniformity of the output $\rho$ is shown
in Table V. The Cartesian coordinates of Table IV were calculated from azimuths
and elevations measured with several cinetheodolites. Table V shows the averages
and variance estimates of five elevation output $\rho$'s from one cinetheodolite.

6.  POSSIBLE CAUSES OF DIFFICULTIES.  The panel at the Twenty-Second
Conference on the Design of Experiments in Army Research, Development, and
Testing made some comments on this problem.

First, the use of polynomials was seriously questioned. The fluctuation
in calculated $\rho$ should not occur if the mathematical model fits the physical
situation. Since the form of the equation for missile trajectories is unknown
except in idealized circumstances, a parameter-free approach was suggested.

Second, if polynomials must be used to compare with current correlation-
ignored results using quadratics, it was suggested that the sum of squares of
deviations should not be minimized; instead of the deviation, the deviation divided
by the square root of a previous estimate of the variance should be used. This
procedure, which would change both the position estimates and their variance
estimates, should be iterated until the position estimates stabilize.

Third, since the path of an object depends on previous position, velocity,
and acceleration of the object and not on future values, it was suggested that
the forward time end of the smoothing span, instead of its midpoint, be used
to estimate position and variance.


TABLE II

                         OUTPUT $\rho$
SMOOTHING SPAN    n = 0 or 1    n = 2 or 3    n = 4 or 5


TABLE III
MISSILE ANALYSIS

                         OUTPUT $\rho$
SMOOTHING SPAN    n = 0 or 1    n = 2 or 3    n = 4 or 5
       7              .02           .20           .99
      11              .00           .09           .30
      15              .10           .04           .09
      21              .35           .05           .07
      31              .58           .10           .07

TABLE IV
DRONE ANALYSIS
OUTPUT $\rho$ FOR n = 2 or 3 AND SMOOTHING SPAN = 15

SEGMENT       X        Y        Z      AVERAGE
   1         .38      .80      .59       .59
   2         .46      .34      .48       .43
   3         .67      .31      .69       .49
   4         .69      .69      .50       .69
   5         .86      .80      .68       .78
   6        1.00      .43      .52       .66
   7        1.00     1.00     1.00      1.00
   8         .06     -.06      .25       .05
AVERAGE      .64      .64      .58       .59

TABLE V

OUTPUT $\rho$ FOR ELEVATION OF DRONE FROM
FIVE SUCCESSIVE SEGMENTS ON ONE FILM

n = 0 or 1       n = 2 or 3       n = 4 or 5
.66 ± .27        1 34 1 1 39      -.08 ± .62


ROBUST OUTLIER DETECTION IN TRAJECTORY DATA REDUCTION

William S. Agee and Robert H. Turner
Analysis and Computation Division
National Range Operations Directorate
US Army White Sands Missile Range
White Sands Missile Range, NM 88002


ABSTRACT.  A data reduction program at White Sands Missile Range that
often has an hour of flight time is called the Multiple Radar Tracking
System (MRTS). Undetected outliers destroy automated data reduction,
causing a significant number of reruns with human detection of these
outliers. The procedure described in this paper enables the MRTS to reduce
large quantities of radar data with very little chance of being influenced
or ruined by outliers.

Outliers are detected by examining residuals from a least squares
estimation. Three robust methods of estimation which are insensitive to
outliers are described. The masking effect is almost nonexistent in
these methods.

1.  INTRODUCTION.  An entire trajectory of Cartesian position, velocity,
and acceleration data is produced from radar (range, azimuth, and
elevation) data by the Multiple Radar Trajectory System (MRTS). The MRTS
consists of four distinct areas:

a.  Data gathered from several sources are merged onto one file after
being calibrated and time corrected.

b.  A preprocessor eliminates outliers and computes initial observation
variances and initial X, Y, Z positions. The robust outlier detector
is in this stage.

c.  A batch processor produces the entire trajectory simultaneously
from all observations (except outliers).

d.  A fixed lag optimal smoother then produces smoothed positions,
velocities, and accelerations.

The remainder of this paper is about the preprocessor stage. As the
program is at present, whenever outliers are found they are discarded
instead of being deweighted.

In order to detect outliers an examination of residuals should be
made. But those residuals must not come from an estimation of the
observation process that is influenced by the outliers. Three estimation
schemes are described which are resistant to outliers. Two methods of
examining the residuals for outlying observations are described. The use
of the outlier resistant estimation and residual examination make up the
robust outlier detector used in the preprocessor stage of the MRTS.

2.  mLllkMmmrtmmm.  The observation model is

$$ x_i = a_0 + a_1 t_i + a_2 t_i^2 + e_i , \qquad i = 1, \ldots, n . $$

The three methods described are called:

a.  Least squares with robust weights,

b.  Brown-Mood, and

c.  Theil-Sen.

The first one is used in the MRTS.

Least Squares with Robust Weights.  The median of the observations x*
and its respective time t* are found. For each observation compute


‘ Tl'j-t*]'" 

Solve  for  the 

■ (Igiliil2)' 
by  minimising 

,1,  "1 

where 

I -4 

j. 


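Brown-Mood.  The following steps show the iterative process for slope
and curvature coefficients:

a.  Initialize

$$ \hat{a}_1^{(0)} = \hat{a}_2^{(0)} = 0 , \quad t^* = \mathrm{med}(t_i) , \quad t^+ = \mathrm{med}(t_i > t^*) , \quad t^- = \mathrm{med}(t_i < t^*) , \quad j = 0 $$

b.  Find median residual in each half

$$ x^+ = \underset{t_i > t^*}{\mathrm{med}} \left( x_i - \hat{a}_1^{(j)} t_i - \hat{a}_2^{(j)} t_i^2 \right) , \qquad x^- = \underset{t_i < t^*}{\mathrm{med}} \left( x_i - \hat{a}_1^{(j)} t_i - \hat{a}_2^{(j)} t_i^2 \right) $$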

c.  Update  coefficients 
,jM) 

The relaxation factor of 1/2 seems to provide faster and more stable
convergence.

d.  Repeat steps b and c until convergence, then compute the
intercept coefficient


r*2 


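Theil-Sen.  This method is not iterative but it does require many
medians be taken. First, all the divided differences $d_{j,i}$, without
duplication, are defined by

$$ d_{j,i} = \frac{x_i - x_j}{t_i - t_j} , \qquad j < i . $$

To compute all possible divided differences of the $d_{j,i}$ would take too
much time and space. Instead a smaller number of divided differences
which represent the $d_{j,i}$ well is computed

$$ e(i, i+\Delta, i+2\Delta) = \frac{d_{i+\Delta,\,i+2\Delta} - d_{i,\,i+\Delta}}{t_{i+2\Delta} - t_i} $$

for

$$ i = 1, \ldots, n-2\Delta , \qquad \Delta = 1, \ldots, [n/3] . $$

Let

$$ \hat{a}_2 = \mathrm{med}\{ e(i, i+\Delta, i+2\Delta) \} . $$

Now

$$ d_{j,i} = \frac{x_i - x_j}{t_i - t_j} = a_1 + a_2 (t_i + t_j) $$

since

$$ x_i = a_0 + a_1 t_i + a_2 t_i^2 . $$

Let

$$ \hat{a}_1 = \mathrm{med}\left( d_{j,i} - \hat{a}_2 (t_i + t_j) \right) $$

and finally

$$ \hat{a}_0 = \mathrm{med}\left( x_i - \hat{a}_1 t_i - \hat{a}_2 t_i^2 \right) . $$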
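The median-based steps of the Theil-Sen description can be sketched in a few lines. This is an illustrative reading of the text, not the MRTS code: it assumes that the curvature comes from second divided differences over triples (i, i+step, i+2*step), that the slope median runs over all pairs, and that the intercept is the median of the remaining residuals. The function name is hypothetical.

```python
from statistics import median

def theil_sen_quadratic(t, x):
    """Median-based estimates (a0, a1, a2) for x = a0 + a1*t + a2*t**2."""
    n = len(t)

    def d(j, i):
        # First divided difference between points j and i.
        return (x[i] - x[j]) / (t[i] - t[j])

    # Second divided differences over equally indexed triples estimate a2.
    second = []
    for step in range(1, n // 3 + 1):
        for i in range(n - 2 * step):
            mid, last = i + step, i + 2 * step
            second.append((d(mid, last) - d(i, mid)) / (t[last] - t[i]))
    a2 = median(second)

    # d(j, i) = a1 + a2 * (t_i + t_j), so remove the curvature part and take the median.
    a1 = median(d(j, i) - a2 * (t[i] + t[j]) for i in range(n) for j in range(i))

    # The intercept is the median of what is left.
    a0 = median(xi - a1 * ti - a2 * ti * ti for ti, xi in zip(t, x))
    return a0, a1, a2
```

Because every step is a median, a few gross outliers leave the estimates essentially unchanged, so the residuals `x[i] - (a0 + a1*t[i] + a2*t[i]**2)` at the contaminated points stay large and visible.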


3.  ;;fiy  A Grubbs-type statistic proposed by Tietjen and Moore is
described. A modified version of this statistic is used in the MRTS.

Grubbs-Type Statistic.  All residuals are ordered by absolute values.
We make a change of variable names so that the z's correspond to the
observations

$$ |z_1| \le |z_2| \le \cdots \le |z_n| . $$

After finding the largest gap

$$ \max_k \left( |z_{n-k+1}| - |z_{n-k}| \right) , $$

compute the test statistic

$$ E_k(n) = \frac{\sum_{i=1}^{n-k} (z_i - \bar{z}_k)^2}{\sum_{i=1}^{n} (z_i - \bar{z})^2} $$

where

$$ \bar{z}_k = \frac{1}{n-k} \sum_{i=1}^{n-k} z_i $$

and

$$ \bar{z} = \frac{1}{n} \sum_{i=1}^{n} z_i . $$

If $E_k(n)$ is smaller than the desired critical value, we conclude that
these k most extreme residuals correspond to outlying observations.

Modified Grubbs-Type Statistic.  This is the same as previously
described except for the denominator of the test statistic and the critical
value selected. Instead of testing for k outliers in n samples, we test
for one outlier in n-k+1 samples. We compute

$$ E_1(n-k+1) = \frac{\sum_{i=1}^{n-k} (z_i - \bar{z}_k)^2}{\sum_{i=1}^{n-k+1} (z_i - \bar{z}_{k-1})^2} $$

where

$$ \bar{z}_k = \frac{1}{n-k} \sum_{i=1}^{n-k} z_i $$

and

$$ \bar{z}_{k-1} = \frac{1}{n-k+1} \sum_{i=1}^{n-k+1} z_i . $$

If $E_1(n-k+1)$ is smaller than the desired critical value, we conclude that
the k most extreme of the n residuals correspond to outlying observations.
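Both residual tests can be sketched together. The code below is a hedged illustration rather than the MRTS routine: it assumes the Tietjen-Moore form of the statistic (ratio of sums of squared deviations about the respective means), picks k from the largest gap between consecutive absolute residuals, and leaves out the critical values, which must be taken from published tables. The function names are hypothetical.

```python
def largest_gap_k(residuals):
    """Number of residuals lying above the largest gap in the ordered absolute values."""
    a = sorted(abs(r) for r in residuals)
    n = len(a)
    # Pair each gap with the count of residuals above it; max() picks the widest gap.
    gaps = [(a[i + 1] - a[i], n - 1 - i) for i in range(n - 1)]
    return max(gaps)[1]

def sum_sq_dev(values):
    """Sum of squared deviations of the values about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def grubbs_type_statistics(residuals, k):
    """Return (E_k(n), E_1(n-k+1)) for the k residuals of largest absolute value."""
    z = sorted(residuals, key=abs)            # |z_1| <= ... <= |z_n|
    n = len(z)
    inner = sum_sq_dev(z[: n - k])            # the k most extreme residuals removed
    e_k = inner / sum_sq_dev(z)               # test for k outliers in n samples
    e_1 = inner / sum_sq_dev(z[: n - k + 1])  # test for one outlier in n-k+1 samples
    return e_k, e_1
```

Small values of either statistic indicate that removing the extreme residuals collapses the spread, which is the signature of outliers. The two versions share a numerator and differ only in the denominator, as the text states.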

3.  EXAMPLES.  The three previously described estimation procedures
and an unweighted least squares were applied to four sets of real data.
The original sets of data and residuals from each estimation are listed.

Example 1 - a set of 16 observations where the last two are outliers
(residuals x 10^6):

                        LEAST SQUARES
        OBSERVATIONS    W/ROBUST WTS    BROWN-MOOD    THEIL-SEN    LEAST SQUARES
  1.       -.0061            -658           -74          -11           -3829
  2.       -.0046            -261           -47           -6           -6479
  3.       -.0044              24           -22            0           -7532
  4.       -.0041             267             0            6           -6991
  5.       -.0037             418            20           11           -4854
  6.       -.0033             667            87           67           -1071
  7.       -.0030             524            51           22            4207
  8.       -.0027             419            13          -22           11081
  9.       -.0023             342            72           33           19650
 10.       -.0021              -6           -71         -111           29615
 11.       -.0017            -227           -17          -86           41375
 12.       -.0013            -620            34            0           54731
 13.       -.0010            -985           -17          -44           69583
 14.       -.0006           -1421            29           11           86129
 15.       -.9590         -960730       -958727      -958733         -854528
 16.        .4451          442389        445014       445022          568910

Example 2 - a set of 15 observations where the third, fourth and
fifth are outliers (residuals x 10^6):

                        LEAST SQUARES
        OBSERVATIONS    W/ROBUST WTS    BROWN-MOOD    THEIL-SEN    LEAST SQUARES
  1.       .21709           -1611          -444         -136         -332222
  2.       .21824           -1497          -313          -87         -314194
  3.       .95519          734413        735591       735744          441640
  4.       .94511          723287        724437       724529          452449
  5.       .93499          712116        713216       713256          465224
  6.       .22288           -1061           -24          -23         -221986
  7.       .22406            -943           -11          -39         -193910
  8.       .22530            -760            54            9         -163748
  9.       .22652            -612            61            9         -131611
 10.       .22770            -510             0          -47          -97508
 11.       .22900            -293            32            0          -61280
 12.       .23028            -101            15           10          -23066
 13.       .23185              75           -39            7           17144
 14.       .23266             286           -81            0           59399
 15.       .23418             502          -140            0          103670

Example 3 - a set of 15 observations where the twelfth, thirteenth,
and fifteenth are outliers (residuals x 10^6):

                        LEAST SQUARES
        OBSERVATIONS    W/ROBUST WTS    BROWN-MOOD    THEIL-SEN    LEAST SQUARES
  1.      -1.70987          -3369          -699            9         -157774
  2.      -1.70942           -667           387            0            -204
  3.      -1.70893            991           225           12          105480
  4.      -1.70846           2166            61           -6          159227
  5.      -1.70793           2708           -54            0          161087
  6.      -1.70741           2576          -159          -14          111021
  7.      -1.70682           1841          -186           23            9099
  8.      -1.70626            402          -233           12         -144780
  9.      -1.70571          -1721          -282          -28         -350595
 10.      -1.70510          -4466          -262          -28         -608277
 11.      -1.70449          -7866          -232          -45         -917885
 12.       1.43777        3129701       3141456      3141568         1862231
 13.       1.44602        3132585       3149144      3149153         1456410
 14.      -1.70267         -22044             0         -121        -2158177
 15.       1.44667        3120482       3148695      3148416          473139

Example 4 - a set of 21 observations where the seventh, twentieth,
and twenty-first are outliers. This example illustrates dropped sign
bits and zeroed data (residuals x 10^6):

                        LEAST SQUARES
        OBSERVATIONS    W/ROBUST WTS    BROWN-MOOD    THEIL-SEN    LEAST SQUARES
  1.      -.00988            -123          -423         -248           -4433
  2.      -.00995             -88          -337         -178           -2839
  3.      -.00976             212             9          154           -1203
  4.      -.01017             -83          -243         -114            -386
  5.      -.01016              47           -76           39             632
  6.      -.01023             102            14          112            1361
  7.       .0               10463         10404        10487           12152
  8.      -.01047             129            96          162            2034
  9.      -.01083             -90          -103          -52            1807
 10.      -.01089              -4             0           36            1661
 11.      -.01089             147           164          182            1356
 12.      -.01121             -16             8           11             613
 13.      -.01143             -75           -46          -60            -449
 14.      -.01162               2            30            0           -1600
 15.      -.01185            -156          -133         -179           -3010
 16.      -.01200            -128          -114         -178           -4558
 17.      -.01206              -6            -5          -85           -6236
 18.      -.01241            -168          -186         -282           -8422
 19.      -.01239              45             5         -108          -10467
 20.       .01215           24783         24717        24587           11810
 21.       .01301           25846         25750        25603           10177


4. 

a.  Least Squares with Robust Weights:

(1)  Almost always can produce residuals which reveal up to half the
sample to be outliers,

(2)  Is the fastest of the three estimators described, and

(3)  May be improved with other choices for weights and iteration.

b.  Brown-Mood Estimator:

(1)  Has unknown convergence properties and

(2)  May not work if too many outliers are in one half.

c.  Theil-Sen Estimator:

(1)  Has robust coefficient estimates,

(2)  Is slowest and simplest of the three estimators described, and

(3)  May be made more efficient by taking advantage of equally spaced
data and by other schemes of selecting divided differences.

d.  Grubbs-Type Statistic:

(1)  Has no masking effect,

(2)  Is fast and easy to use, and

(3)  Could use 2d difference criteria to determine which k residuals
to be tested.

e.  Modified Grubbs-Type Statistic:

(1)  Simplifies table look-up and

(2)  Detects same outliers as the Grubbs-type statistic on all
samples tried so far.


TABLE LOOK-UP AND INTERPOLATION FOR A NORMAL
RANDOM NUMBER GENERATOR

William L. Shepherd and John N. Hynes
Systems Management Division
Instrumentation Directorate
US Army White Sands Missile Range
White Sands Missile Range, New Mexico 88002

ABSTRACT.  A normal random number generator using table look-up and
interpolation for the inverse normal distribution function is presented and
compared to one where the inverse function is computed from a commonly
used formula.

1.  INTRODUCTION.  In Monte Carlo problems and in simulations of noisy
measurements, the cost effectiveness of the required normal pseudo-random
number generators is still of some economic importance. We present and
compare two such generators. One of them is available on the Univac 1108
computer at White Sands Missile Range (WSMR); the other is the main subject
of this report.


Let

$$ P(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du \qquad (2.1) $$

and {y} be the output (sequence) generated by a uniform random number
generator with density function equal to 1 over the interval [0, 1] and
0 elsewhere. Then $\{P^{-1}(y)\}$ can be thought of as the output of an n(0, 1)
random number generator [1, p. 950]. As mentioned in [1], the principal
difficulty in using this principle is in the computation of $P^{-1}(y)$. In
one of the normal random number generators in use at WSMR, $P^{-1}(y)$ is
computed by the formulas

$$ P^{-1}(y) = -P^{-1}(1 - y) \quad \text{for } \tfrac{1}{2} < y < 1 , \qquad (2.2) $$

$$ P^{-1}(y) = -\eta + \frac{a_0 + a_1 \eta + a_2 \eta^2}{1 + b_1 \eta + b_2 \eta^2 + b_3 \eta^3} \quad \text{for } 0 < y \le \tfrac{1}{2} , $$

$$ \eta = \sqrt{\ln(1/y^2)} , $$

    a_0 = 2.515517        b_1 = 1.432788
    a_1 =  .802853        b_2 =  .189269
    a_2 =  .010328        b_3 =  .001308

with error less than $4.5 \times 10^{-4}$. (This formula is also given in [1] and
[2].) We refer to this generator as Generator A.

In the following sections, we describe another approximation to $P^{-1}(x)$,
referred to as Generator B.
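The rational approximation behind Generator A can be sketched as below. This is an illustration under stated assumptions, not the WSMR routine: the constants are those of the standard handbook formula that the text cites (used here in place of the digits as they appear in this copy), the sign is chosen so that the result is the lower-tail quantile, and the comparison against Python's `statistics.NormalDist` is added only to check the quoted error bound. The function names are hypothetical.

```python
import math
from statistics import NormalDist

A = (2.515517, 0.802853, 0.010328)   # numerator constants a0, a1, a2
B = (1.432788, 0.189269, 0.001308)   # denominator constants b1, b2, b3

def inverse_normal_rational(y):
    """Approximate inverse of the standard normal distribution function for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    if y > 0.5:
        return -inverse_normal_rational(1.0 - y)      # symmetry, Equation (2.2)
    eta = math.sqrt(math.log(1.0 / (y * y)))
    numerator = A[0] + A[1] * eta + A[2] * eta**2
    denominator = 1.0 + B[0] * eta + B[1] * eta**2 + B[2] * eta**3
    return -(eta - numerator / denominator)

def max_abs_error(points=999):
    """Largest deviation from the library inverse on an even grid in (0, 1)."""
    exact = NormalDist()
    grid = [(i + 1) / (points + 1) for i in range(points)]
    return max(abs(inverse_normal_rational(y) - exact.inv_cdf(y)) for y in grid)
```

Feeding uniform random numbers through `inverse_normal_rational` gives approximately n(0, 1) deviates. The cost per deviate is one logarithm, one square root, and one division, which is the cost that the table look-up of Generator B is meant to avoid.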

3.  A SPLINE APPROXIMATION TO $P^{-1}(y)$.  First, consider

$$ g(y) = g(a) + g'(a)(y - a) + \epsilon (y - a)^2 \quad \text{for } a \le y \le \bar{y} , $$

$$ g(y) = g(b) + g'(b)(y - b) + \gamma (y - b)^2 \quad \text{for } \bar{y} \le y \le b . $$

Set $h = b - a$, $\bar{y} = (a + b)/2$,

$$ \epsilon = \frac{2}{h^2} (g(b) - g(a)) - \frac{1}{2h} (3g'(a) + g'(b)) , $$

$$ \gamma = -\frac{2}{h^2} (g(b) - g(a)) + \frac{1}{2h} (3g'(b) + g'(a)) . $$

With some laborious manipulation, it can be verified that

$$ g(\bar{y}^-) = g(\bar{y}^+) , \qquad g'(\bar{y}^-) = g'(\bar{y}^+) . $$

g(y) is a quadratic spline, with knots $(a, \bar{y}, b)$, on [a, b], which
interpolates locally between (a, g(a)) and (b, g(b)).
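The two-piece quadratic above is easy to build and check numerically. The sketch below assumes the middle knot is the midpoint of [a, b] and uses the coefficient formulas for the two pieces in that form; applying it to the inverse normal function through `statistics.NormalDist` is an added illustration, and all function names are hypothetical.

```python
from statistics import NormalDist

def spline_coefficients(a, b, ga, gb, dga, dgb):
    """Quadratic coefficients (eps, gam) of the left and right pieces on [a, b]."""
    h = b - a
    eps = 2.0 * (gb - ga) / h**2 - (3.0 * dga + dgb) / (2.0 * h)
    gam = -2.0 * (gb - ga) / h**2 + (3.0 * dgb + dga) / (2.0 * h)
    return eps, gam

def make_spline(a, b, g, dg):
    """Return (spline, left, right, mid) matching g and its slope dg at a and b."""
    ga, gb, dga, dgb = g(a), g(b), dg(a), dg(b)
    eps, gam = spline_coefficients(a, b, ga, gb, dga, dgb)
    mid = 0.5 * (a + b)

    def left(y):
        return ga + dga * (y - a) + eps * (y - a) ** 2

    def right(y):
        return gb + dgb * (y - b) + gam * (y - b) ** 2

    def spline(y):
        return left(y) if y <= mid else right(y)

    return spline, left, right, mid

_STD = NormalDist()

def inv_cdf(y):
    """Inverse of the standard normal distribution function (library value)."""
    return _STD.inv_cdf(y)

def inv_cdf_slope(y):
    """Derivative of the inverse: 1 / P'(P^-1(y))."""
    return 1.0 / _STD.pdf(_STD.inv_cdf(y))
```

For example, `make_spline(0.6, 0.7, inv_cdf, inv_cdf_slope)` gives two pieces that agree in value at the midpoint 0.65 and reproduce the inverse function closely across the interval. Each evaluation needs only a table look-up of three stored numbers and two multiplications.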


In order to use (3.9), (3.10) for a normal random number generator, we need
a suitable table and a means for computing $P^{-1}(t)$ for $t_{2N} < t < 1$.

Define 

I jg  - P""'|  l(j^b)  " ® • b<l 

u 

We need a $t_{2N}$ close to 1 and a sequence $\{t_{2i}\}_{i=0}^{N}$ with a small N so that
$\| g - P^{-1} \|_{(1/2,\, t_{2N})} \le$ a prescribed tolerance $\epsilon$. We used numerical
search not described in detail here. It took up much computer time and is
not optimal. Essentially, we started with $t_0 = 1/2$, computed $t_2$ so that


1 l9  “ 

• ‘ *-  ‘2 

1 |g  - 

1 t > t2  1 

then  recursively  determined 

1 |g  - 

’’  ^Il(t2v^)  “ " 

, t2^  <,  t < t24+2 

> c 

* ^ '■  ^21+2  • 

$t_{2N}$ was determined empirically by stopping when $t_{2i} - t_{2i-2} <$
a prescribed tolerance $\delta$.

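For the computation of $P^{-1}(y)$ in the above, we used Newton's method for
solving

$$ P(x) - y = 0 $$

for x, with

$$ P(x) = \tfrac{1}{2} \left( 1 + \mathrm{erf}(x/\sqrt{2}) \right) , $$

$$ \mathrm{erf}\, x = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{n!\,(2n+1)} $$

from [1], and

$$ \frac{d}{dy} P^{-1}(y) = \frac{1}{P'(x)} = \frac{1}{P'(P^{-1}(y))} , \qquad P'(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} . $$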
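The Newton iteration described above can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes a starting value of zero and a fixed step tolerance, and it uses the library function `math.erf` in place of the power series quoted from [1]. The function names are hypothetical.

```python
import math

def normal_cdf(x):
    """P(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """P'(x) = exp(-x**2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def inverse_normal_newton(y, tol=1e-12, max_iter=100):
    """Solve P(x) - y = 0 for x by Newton's method, starting from x = 0."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    x = 0.0
    for _ in range(max_iter):
        step = (normal_cdf(x) - y) / normal_pdf(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```

Starting from zero, the iterates move monotonically toward the root because P is convex to the left of zero and concave to the right, so no safeguarding is needed for moderate y. The iteration is too slow to use per random deviate, which is why it serves only to build the table.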


Our numerical experience indicated that near one the knot spacing needed
to obtain the required accuracy is not feasible, as N is large. We now
present two methods for computing $P^{-1}(t)$ for $t_{2N} < t < 1$.

The first method is to use the approximation (2.2) from Generator A.

The second method is to approximate $P^{-1}(t)$ by a quadratic spline with
knots $\{t_{2N}, t_{2N+1}, 1\}$, which has the same area under the curve as
does $P^{-1}(t)$ over each of the intervals between these knots.
(A discussion is given in Appendix A.)

Tables 2, 3 give the requisite coefficients for N = 29, N = 89, respectively.
LEFT INTERVAL, RIGHT INTERVAL refer to $[t_{2i}, t_{2i+1}]$, $[t_{2i+1}, t_{2i+2}]$,
respectively. The last row shows 1 to be a knot. The entries in this row were
obtained according to the equal area criterion and would not be used in a
computer program where a rational approximation is used for $t_{2N} < t < 1$.

EXAMPLE FOR TABLE 2:

For i = 13, $.914602271 \le t \le .920905352$

and

$$g(t) = 1.41116763 + 6.76465741\,(t - .920905352) + 30.8378448\,(t - .920905352)^2 .$$
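As a rough plausibility check, the Table 2 example can be evaluated against a library inverse normal distribution function. The sketch below is an assumption-laden illustration: the digits of the knots and coefficients are read from the printed example (the scan is ambiguous in the last places), both correction terms are taken with a plus sign, and `statistics.NormalDist().inv_cdf` stands in for the exact $P^{-1}$.

```python
from statistics import NormalDist

# Knots and coefficients as read from the printed example (assumed digits).
A_KNOT = 0.914602271
B_KNOT = 0.920905352


def g_table2_example(t):
    """Quadratic piece expanded about the right-hand knot B_KNOT."""
    s = t - B_KNOT
    return 1.41116763 + 6.76465741 * s + 30.8378448 * s * s


def max_abs_error(samples=50):
    """Largest |g(t) - P^{-1}(t)| over an even grid on [A_KNOT, B_KNOT]."""
    std = NormalDist()
    worst = 0.0
    for n in range(samples + 1):
        t = A_KNOT + (B_KNOT - A_KNOT) * n / samples
        worst = max(worst, abs(g_table2_example(t) - std.inv_cdf(t)))
    return worst
```

Calling `max_abs_error()` gives the observed interpolation error on this one interval, which can then be compared with the tolerance quoted for N = 29.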

4. NUMERICAL RESULTS. Generator B was run under 4 separate conditions
as indicated in Table 1. The interpolation tolerance for N = 29 is 10"^
and for N = 89 it is 10'^. Results for Generator A are also included in
Table 1.
TABLE 1. AVERAGE RUN TIMES (CPU) IN SECONDS WITH IDENTICAL INPUT
OF 10,000 POINTS

GENERATOR A                                                      2.427272

GENERATOR B
  For N = 29 (with rational function approximation at the end)   1.307336
  For N = 29 (with spline approximation at the end)              1.111576
  For N = 89 (with rational function approximation at the end)   1.503344
  For N = 89 (with spline approximation at the end)              1.276524


5. COMMENTS AND RESEARCH NEEDED. From Table 1, the following empirical
inferences can be made.

a. Generator B with N = 29 and with either end option is slightly more
accurate, and about twice as fast as Generator A. It requires 186 stored
constants.


b. As compared to Table 2, Table 3 provides for interpolation over a
larger interval, is a little slower, provides six significant digit
interpolation accuracy but requires 643 stored constants.

Additional research could be done in the approximations at the end. (2.2)
is not necessarily optimal for $t_{2N} \le t < 1$.

The constants for Generator B are believed to be of nine significant digit
accuracy. It is possible that they do not have to be this accurate.
Further research could address this problem.

Since computation for N = 89 is only a little slower than for N = 29, but
interpolates much more accurately, we think more of the CPU time is used in
the interpolation than in the table look-up logic. As higher order
interpolation is slower than quadratic, there is not much advantage in using
it in order to reduce the required number of knots.
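APPENDIX A. In this Appendix, we outline the procedure for extending the
spline over the interval $[t_{2N}, 1)$. Principally, we want to exhibit the
resulting coefficients, omitting most of the detail.

First, we want $\beta$ so that, with

$$g_N(t) = P^{-1}(t_{2N}) + (P^{-1})'(t_{2N})\,(t - t_{2N}) + \beta\,(t - t_{2N})^2 \tag{A.1}$$

for $t_{2N} \le t \le \tfrac{1}{2}(t_{2N} + 1) = T$, we have

$$\int_{t_{2N}}^{T} g_N(t)\,dt = \int_{t_{2N}}^{T} P^{-1}(t)\,dt . \tag{A.2}$$

By the change of variables t = P(x), we have

$$\int_{t_{2N}}^{T} P^{-1}(t)\,dt = \int_{P^{-1}(t_{2N})}^{P^{-1}(T)} x\,P'(x)\,dx = \frac{1}{\sqrt{2\pi}} \int_{P^{-1}(t_{2N})}^{P^{-1}(T)} x \exp(-x^2/2)\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \left( \exp[-(P^{-1}(t_{2N}))^2/2] - \exp[-(P^{-1}(T))^2/2] \right). \tag{A.3}$$

Similarly,

$$\int_{T}^{1} P^{-1}(t)\,dt = \frac{1}{\sqrt{2\pi}} \exp[-(P^{-1}(T))^2/2] .$$

(A.3), (A.1), and (A.2) yield

$$\beta = \frac{3}{(T - t_{2N})^3} \left\{ \frac{1}{\sqrt{2\pi}} \left( \exp[-(P^{-1}(t_{2N}))^2/2] - \exp[-(P^{-1}(T))^2/2] \right) - P^{-1}(t_{2N})\,(T - t_{2N}) - \tfrac{1}{2}\,(P^{-1})'(t_{2N})\,(T - t_{2N})^2 \right\} .$$

For $T \le t < 1$, we proceed similarly to define

$$g_N(t) = g_N(T) + g_N'(T)\,(t - T) + \gamma\,(t - T)^2 ,$$

$$\gamma = \frac{3}{(1 - T)^3} \left\{ \frac{1}{\sqrt{2\pi}} \exp[-(P^{-1}(T))^2/2] - g_N(T)\,(1 - T) - \tfrac{1}{2}\,g_N'(T)\,(1 - T)^2 \right\} .$$

With $\beta$ and $\gamma$ so determined,

$$g(t) = g_N(t), \qquad t_{2N} \le t < 1 .$$

It should be noted that $g_N(t)$ is not, strictly, an interpolation function.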
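The equal-area construction can be illustrated numerically. The sketch below rests on several assumptions: the last table knot `t2n` is an arbitrary illustrative value, `statistics.NormalDist` supplies $P^{-1}$ and $P'$, and the two coefficients are obtained by equating the area under each quadratic piece to the exact area under $P^{-1}$.

```python
from statistics import NormalDist

STD = NormalDist()


def equal_area_end_spline(t2n):
    """Return (g, T): the two-piece quadratic on [t2n, 1) and its middle knot."""
    T = 0.5 * (t2n + 1.0)
    H = T - t2n                      # both pieces have this width
    x0, xT = STD.inv_cdf(t2n), STD.inv_cdf(T)
    g0 = x0                          # P^{-1}(t2n)
    dg0 = 1.0 / STD.pdf(x0)          # derivative of P^{-1} at t2n
    area_left = STD.pdf(x0) - STD.pdf(xT)   # exact area over [t2n, T]
    area_right = STD.pdf(xT)                # exact area over [T, 1]
    beta = 3.0 / H**3 * (area_left - g0 * H - 0.5 * dg0 * H**2)
    gT = g0 + dg0 * H + beta * H**2
    dgT = dg0 + 2.0 * beta * H
    gamma = 3.0 / H**3 * (area_right - gT * H - 0.5 * dgT * H**2)

    def g(t):
        if t <= T:
            s = t - t2n
            return g0 + dg0 * s + beta * s * s
        s = t - T
        return gT + dgT * s + gamma * s * s

    return g, T


def simpson(f, lo, hi):
    """Simpson's rule on one panel; exact for quadratics."""
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(0.5 * (lo + hi)) + f(hi))
```

Since $P^{-1}(t)$ is unbounded as t approaches 1 while the spline stays finite, the match is in area only, which is consistent with the remark that $g_N$ is not strictly an interpolation function.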


[Table: QUADRATIC SPLINE COEFFICIENTS; entries not recoverable.]
EIGENVECTORS ANALYSIS OF EMPIRICAL DATA
VERSUS UTILIZATION OF STANDARD FUNCTIONS


Oskar M. Essenwanger
Physical Sciences Directorate
Technology Laboratory
US Army Missile Research and Development Command
Redstone Arsenal, Alabama 35809

ABSTRACT. Parameterization of empirical data (e.g., the wind profiles
from surface to 25 Km altitude) in many cases entails the
approximation of data by mathematical functions. In general, several
options which lead to solutions are available but the question of which
is the most suitable form is sometimes difficult to answer.

Often a specific goal of approximating data by mathematical functions
is the derivation of one characteristic parameter or variate.
Theoretically, eigenvector analysis (or equivalently the development of
empirical polynomials) should lead to maximum information by a single
parameter.

A comparison between approximations by eigenvectors and standard
(orthogonal) functions has been made. It is shown that in particular
cases standard functions can achieve equivalent reductions of the
variance and they may be simpler and more economical to compute than
eigenvector functions.

1. INTRODUCTION. Parametrization of atmospheric data (such as
the wind profile as function of the altitude) requires the derivation
of suitable mathematical expressions. The availability of high speed
electronic data processing tools has opened the door to a utilization
of the most sophisticated mathematical tools even for the generally
huge collectives of atmospheric data. For example, the calculation of
empirical polynomials (or eigenvectors in mathematical terminology) is
now possible without too much difficulty for the large dimensions of
atmospheric data matrices. Consequently it is very tempting to "grind"
huge data collections through the computers without considering how
much benefit these highly sophisticated tools render compared with the
application of standard functions or simple parameters.

In this article, some light is shed on the utilization of empirical
polynomials in comparison with the use of standard functions exemplified
by the wind profiles of certain altitude ranges. Under certain
conditions, standard functions can achieve an equivalent reduction of the
variance to the one obtained by eigenvector analysis.

2. THE CALCULATION OF EIGENVECTORS. The problem under consideration
is the development of proper functions for the wind speed profile
$V_{h,i}$ where the h is a subscript denoting the altitude. $\bar V_h$ designates a
mean wind speed profile. The wind direction can be treated equivalently.
We formulate the representation of the wind speed profile:

$$V_{h,i} - \bar V_h = B_{1,i}\,\phi_{1,h} + B_{2,i}\,\phi_{2,h} + \dots + B_{n,i}\,\phi_{n,h} \tag{1}$$

where $i = 1, \dots, N$, and $n = H$. In this equation the coefficients $B_{j,i}$
and the functions $\phi_{j,h}$ must be determined.

The development of optimized characteristic functions is a
known problem of matrix analysis. A mathematical formulation, equation (2),
relates a matrix of eigenvectors (or polynomials), the data matrix for the
wind profile, and a (diagonal) matrix of eigenvalues. The elements of the
(symmetric) data matrix are either the covariances (3a), the standardized
covariances (3b), or the correlations.

A judgement of the effectiveness of the systems can be made by a
calculation of the residual or left variance, or the percentage reduction,
which can be readily obtained from the eigenvalues $\lambda_j$ by:

$$PR_j = \lambda_j \Big/ \sum_{j=1}^{n} \lambda_j \tag{4}$$

More details on the mathematical background can be found in the author's
text (1976). The covariance and the correlation systems have been compared
in a recent article by Essenwanger (1975), and will not be repeated here.
In this article it is illustrated that the percentage reduction varies
largely with the particular system which is selected but the residual
variance (error) is of the same magnitude for the same number of terms
irrespective of the percentage reduction of the individual system.
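The percentage reduction attributable to the first eigenvector can be illustrated with a small self-contained sketch. Everything here is hypothetical: the synthetic "profiles", the function names, and the use of plain power iteration on the covariance matrix (a stand-in for whatever eigenvalue routine was actually used). The only fact relied on is that the eigenvalues of a symmetric matrix sum to its trace.

```python
import math
import random


def covariance_matrix(profiles):
    """Covariances between altitude levels, averaged over observations."""
    n_obs, n_lev = len(profiles), len(profiles[0])
    mean = [sum(p[h] for p in profiles) / n_obs for h in range(n_lev)]
    dev = [[p[h] - mean[h] for h in range(n_lev)] for p in profiles]
    return [[sum(d[h] * d[k] for d in dev) / n_obs for k in range(n_lev)]
            for h in range(n_lev)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(v[r] * mv[r] for r in range(n)), v


def first_percentage_reduction(matrix):
    """100 * (largest eigenvalue) / (sum of eigenvalues = trace)."""
    eigenvalue, _ = leading_eigenpair(matrix)
    return 100.0 * eigenvalue / sum(matrix[h][h] for h in range(len(matrix)))


# Synthetic data: one dominant sine-shaped mode plus small noise (illustrative).
rng = random.Random(1)
shape = [math.sin(math.pi * (h + 0.5) / 8) for h in range(8)]
profiles = [[rng.gauss(0, 5) * 0 + a * s + rng.gauss(0, 0.5) for s in shape]
            for a in (rng.gauss(0, 5) for _ in range(200))]
```

With data of this kind `first_percentage_reduction(covariance_matrix(profiles))` is close to 100, mirroring the situation in which one eigenvector carries most of the variance.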
3. EIGENVECTORS OF THE WIND PROFILE. First it should be clarified
that under the term "wind profile" the structure of the wind velocity
in the first 10 m of the atmosphere is not meant. The nomenclature
designates the wind speed or direction as a function of the altitude up
to about 25 or 30 Km.

The first eigenvectors of the wind direction covariance matrix for
the altitude range surface to 24 Km are depicted in Figure 1 for January
and July at stations representative of four climatic zones. We learn
from inspecting Figure 1 that it would be very difficult to find an
adequate standard function to approximate that particular structure of
the atmospheric direction profile.

In turn, as displayed in Figure 2, the first eigenvector of the
wind speed from surface to 10 Km altitude range lends itself readily
to replacement by a standard function. A linear curve fit would
adequately replace the eigenvector for three stations, and the fitting of
a second order curve may be a successful approximation for Albrook.

The examination of the eigenvectors for the surface to 24 Km wind
speed system follows next. Figure 3 discloses that at least for some
climatic regimes a standard function such as the Fourier series may be
applicable. This fact is supported by scrutinizing Figure 4 which
exhibits the wind speed profile for Montgomery. As it is displayed, the
major eigenvector comprises over 80% of the variance and resembles a
sine wave. Indeed, a Fourier analysis of the first three eigenvectors
revealed that at least the first two eigenvectors provide largely one
dominant Fourier term. A comparison of the eigenvector and Fourier
system appears to be a worthwhile study.
Figure 4. First Three Eigenvectors (Scaled), Montgomery, Surface - 24 Km.


4. STANDARD FUNCTIONS FOR THE WIND PROFILE. While empirical
polynomials provide an optimum of information in one single term,
standard functions have other advantages. One of them is the homogeneous
mathematical background for different collectives, e.g., data from
different climatic regimes. This homogeneity is beneficial for a
classification of the wind profile into categories (see Essenwanger, 1974).

The differences of the percentage reductions between individual order
terms at locations from typical climatic regimes are not partially or
entirely caused by the diversity of this mathematical background.

Because the present goal is the derivation of one characteristic
parameter, the homogeneity of the background is of secondary importance
here. Of interest, however, is the simplicity or the cost savings
associated with the utilization of standard functions.
Table 1. Left Variance of Wind Profile (Surface to 25 Km)

$$V_{h,i} - \bar V_h = A_{0,i} + A_{1,i} \sin(\alpha_h + P_{1,i}) + A_{2,i} \sin(2\alpha_h + P_{2,i}) + \dots \tag{5a}$$

$$V_{h,i} = C_{0,i} + C_{1,i} \sin(\alpha_h + P_{1,i}) + C_{2,i} \sin(2\alpha_h + P_{2,i}) + \dots \tag{5b}$$

and the eigenvector system of equation (3b) which had emerged as the
system with the smallest residual variance of the three eigenvector
systems in a separate study.

We learn that one term of the eigenvector system with coefficient
$B_{1,i}$ displays the lowest left variance. It should be noticed that $A_0$
or $C_0$ is the first coefficient of the Fourier system, which leads to:

$$\sigma_A^2 = \sum_h \sum_i (V_{h,i} - \bar V_h - A_{0,i})^2 / (H \cdot N) \tag{6a}$$

or:

$$\sigma_C^2 = \sum_h \sum_i (V_{h,i} - C_{0,i})^2 / (H \cdot N) . \tag{6b}$$

Consequently the column for one term of the eigenvector system must be
compared with the columns $\sigma_A^2$ and $\sigma_C^2$. Attention should be called that
an assumption:

$$V_{h,i} - \bar V_h = A_{0,i}$$

leads to a residual variance which is quite comparable with the
eigenvector system. Although the system requires that the mean wind speed
profile $\bar V_h$ is known, the prerequisite is identical, however, with the
one in the eigenvector system. It is self evident that the calculation
of the average value $A_{0,i}$ is a trivial task.

A further reduction of the variance is gained by adding terms in
the Fourier or eigenvector series. Because one term of the Fourier
system has two parameters which can be fitted the columns should not
be compared equivalently according to their headings. The left variance
should be compared between one term of the Fourier series and three
terms of the eigenvector system. Then the fact that the left variance
is lowest for the empirical polynomials agrees with the expectation.

One additional fact deserves attention. If we are interested in
a single-variate system, the eigenvector system can only be based on
$B_{1,i}$, because the other coefficients $B_{j,i}$, $j \ge 2$, are independent of
$B_{1,i}$. Although the Fourier system is orthogonal the coefficients can
be related (see Essenwanger, 1964).

5. SINGLE PARAMETER FOURIER SYSTEM. Before a single-parameter
system other than based on $A_0$ can be examined let us derive an
analytical expression for the replacement of the current coefficients of
the Fourier system by an approximation. We cast:

$$V_{h,i} - \bar V_h = (A_{0,i} + \varepsilon_{A_0}) + (A_{1,i} + \varepsilon_{A_1}) \sin(\alpha_h + P_{1,i} + \Delta P_{1,i}) + \dots \tag{7}$$

By summation over h and omission of the terms which become zero we
deduce the following expression for the left variance:

$$\mathrm{Var}_L = s^2 - A_0^2 + \varepsilon_{A_0}^2 - A_1^2/2 + A_1 (A_1 + \varepsilon_{A_1})(1 - \cos \Delta P_1) + \varepsilon_{A_1}^2/2$$
$$\qquad - A_2^2/2 + A_2 (A_2 + \varepsilon_{A_2})(1 - \cos \Delta P_2) + \varepsilon_{A_2}^2/2 + \dots \tag{8}$$

(The subscript i denoting the individual observation time has been
omitted.)

It is easily recognized that for $\varepsilon = 0$ and $\Delta P = 0$ Eqn. (8) reduces
to the well-known formula for the left variance (e.g. see Essenwanger,
1976) because the two terms after $A_j^2/2$ disappear. It may be reasonable
that $A_j^2 > \varepsilon_{A_j}^2$ for the dominant Fourier term. For the other terms of the
series it may not hold, and instead of a net decrease of the variance,
an increase may result.
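The contribution of an amplitude error and a phase error to the left variance can be checked numerically for a single harmonic. The sketch below assumes the error term has the form $A(A + \varepsilon)(1 - \cos \Delta P) + \varepsilon^2/2$, which is what averaging the squared difference of the two sine waves over a full cycle gives; the function names and sample values are illustrative.

```python
import math


def mean_square_error(amplitude, eps, delta_phase, phase=0.3, samples=360):
    """Average of (A sin(x + P) - (A + eps) sin(x + P + dP))^2 over one cycle."""
    total = 0.0
    for n in range(samples):
        x = 2.0 * math.pi * n / samples
        exact = amplitude * math.sin(x + phase)
        approx = (amplitude + eps) * math.sin(x + phase + delta_phase)
        total += (exact - approx) ** 2
    return total / samples


def error_term(amplitude, eps, delta_phase):
    """Closed form A (A + eps)(1 - cos dP) + eps^2 / 2."""
    return (amplitude * (amplitude + eps) * (1.0 - math.cos(delta_phase))
            + 0.5 * eps ** 2)
```

For a phase error beyond a quarter cycle the closed form exceeds $A^2/2$ when eps is zero, so using the approximated term increases the variance instead of reducing it.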

A critical contribution to the error variance is also made by $\Delta P$.
It is obvious that for $|\Delta P| > \pi/2$ the cosine term becomes negative, and
thus the error contribution of this term may become quite significant
unless the amplitude is small. Inspection of Figure 5 reveals that $P_1$
for the system (5b) displays a distinct maximum for its frequency
distribution, and a replacement of the individual $P_{1,i}$ by its mean may
suffice. However, $P_1$ for the system (5a) exhibits a bimodal distribution
(Figure 6). Consequently we must find a characteristic parameter
which provides a close approximation of $P_1$, $A_0$ and $A_1$. The investigation
is still in progress but tentative results indicate that choosing a
single characteristic such as:
(9a)

or:

$$\gamma_i = \sum_k w_k V_{k,i} \tag{9b}$$

may succeed. Then $P_1$, $A_0$ and $A_1$ become functions $P_1(\gamma)$, $A_0(\gamma)$ and
$A_1(\gamma)$, etc. k and r denote certain altitude levels, and w stands for
appropriate weights. Tentative results are depicted in Table 2 which had
been obtained under favorable conditions. We learn that the surface to
23 Km single parameter system would be competitive with the eigenvector
system. It should be considered that an increase of the variance of 25%
is not significant at the 95% level of confidence for the F-test for
N ~ 200.


It is emphasized that the replacement by standard functions cannot
be generalized for the wind profile from all altitude ranges. For
example, if our goal is the derivation of a single characteristic for
the surface to 15 Km range, probably the eigenvector system is the best
approach. The possibilities of a replacement by standard functions must
be examined in every individual case.

6. CONCLUSIONS. A comparison was made between curve fitting
systems based on empirical polynomials (i.e. eigenvectors) and standard
functions. It was disclosed that the eigenvector system offers an
optimum reduction of the variance with a minimum number of coefficients
as expected from theory. It was illustrated, however, that under certain
conditions standard functions may perform quite well, and these are
simpler and more economical to compute than eigenvector functions.

7. REFERENCES CITED.

Essenwanger, Oskar M., 1964, Mathematical Characteristics of
Individual Wind Profiles; Report RR-TR-64-12, US Army Missile Command,
pp. 41.

Essenwanger, Oskar M., 1974, The Structure of the Wind Profile
From Surface to 25 Km in Various Climatic Zones; Klimatologische
Forschung (The Hermann Flohn 60th Anniversary Vol); Ferd. Dümmler,
Bonn, p. 523-539.

Essenwanger, Oskar M., 1975, Eigenvector Representation of Wind
Profiles; Preprints, Fourth Conference on Probability and Statistics in
Atmospheric Sciences, p. 206-210; (Publ. American Meteorological Soc.,
Boston).

Essenwanger, Oskar M., 1976, Applied Statistics in Atmospheric
Sciences, Vol A.; Elsevier, Amsterdam-New York, pp. 412.

8. ACKNOWLEDGMENT. The author wishes to express his gratitude
to Dr. D. A. Stewart for her critical review of the manuscript.




INDUCTION  ON  A MARKOV  CHAIN 
ABSTRACT. Through the use of Markov chain methods, expressions for Mean
Rounds Between Failure (MRBF) were found for a class of weapon systems. The
method led to an inductive determination of an expression for the general case.

Following the derivation of the general MRBF expression, expressions for
reliability are obtained (but not a general expression).

1. INTRODUCTION. The problems treated in this paper relate to a shipboard
weapon system of the following type. Some number (a variable) of gun
mounts are connected in parallel. This parallel network is then connected in
series with a fire control system. Each gun mount has the same number of guns
(for simplicity, we will assume one gun per mount; the results are easily
extended to some other number of guns per mount).

$q_1$ = Prob (given mount functions successfully)   (1)

$q_2$ = Prob (fire control functions successfully)   (2)

$p_i = 1 - q_i$.   (3)

Once a mount fails, it is considered inoperative thereafter.

Note that we are assuming that each gun mount has the same success
probability. This assumption simplifies the Markov chain work somewhat, but,
as we will show later, even this simplifying assumption doesn't serve much
purpose in the end.

In this particular application, the interest was focused only on the
behavior of the gun mounts and fire control. We are therefore not concerned
with failures of other parts of the system, such as the guns or the
ammunition, and will, for convenience, assume that these function perfectly.

2. MEAN ROUNDS BETWEEN FAILURE (MRBF). In this application, MRBF will
be defined as the expected number of rounds, successful and unsuccessful,
attempted up to and including the first salvo where either none of the
mounts function, the fire control does not function, or both events occur.

For one mount, it is apparent that MRBF follows a geometric
distribution. The probability of a salvo successfully occurring is $q_1 q_2$.
From the properties of the geometric distribution, then

$$\mathrm{MRBF} = 1/(1 - q_1 q_2) \tag{4}$$

For two mounts, a Markov chain was constructed with the following
state definitions:

S0 = everything working

S1 = one mount out, fire control working

S2 = system not working

The transition matrix was as follows:

          S0              S1                  S2
S0    $q_1^2 q_2$     $2 p_1 q_1 q_2$     1 - Σ left elements
S1        -           $q_1 q_2$           1 - Σ left elements
S2        1

In the above matrix, the expression "left elements" refers to matrix
elements in the same row but in columns to the left. One would ordinarily, and
correctly, think that in row S2 the one should be in column S2 rather than
S0, thereby reflecting the fact that state S2 is an absorbing state. This
one is shifted to S0, however, to change the problem into one that can be
treated as a first passage situation.


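We will use column S2 to bring about degeneracy, so we are not concerned
about what the actual values in this column turn out to be. Solving, therefore,
for the steady state probabilities in terms of the steady state probability
for state S2 (denoted P(S2)), we have

$$P(S0) = \frac{1}{1 - q_1^2 q_2}\, P(S2) \tag{5}$$

$$P(S1) = \frac{2 p_1 q_1 q_2}{(1 - q_1^2 q_2)(1 - q_1 q_2)}\, P(S2)$$

Two rounds are attempted in each salvo spent in S0 and one in each salvo
spent in S1, so that

$$\mathrm{MRBF} = \frac{2\,P(S0) + P(S1)}{P(S2)} = \frac{2}{1 - q_1 q_2} .$$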


For three mounts, the transition matrix becomes more complicated, so
the simplified Markov chain method [1,2] was used. The states were defined
in terms of situations, rather than on a salvo-by-salvo basis. The states
were

S0 = system working
S1 = 1 mount failed in salvo of first failure
S2 = 2 mounts failed in salvo of first failure
S3 = system not working

The transition matrix becomes:

[Transition matrix: entries not recoverable apart from the closing
"Σ left elements" term.]

Going through the steps required for solution, as described in [1,2],
we obtain

$$\mathrm{MRBF} = \frac{3}{1 - q_1 q_2} .$$

Using the simplified Markov chain method for 4, 5 and 6 mounts, the pattern

$$\mathrm{MRBF}_l = \frac{l}{1 - q_1 q_2}$$

continued, where l is the number of mounts.

Since only the top row of the transition matrix in the simplified
form has any new information as the number of mounts increases, and since
the expected length of the various states (in the simplified Markov chain
sense) was determined as the number of mounts increased, induction was
considered.

By considering the result true for k = l and considering what the
structure of the top row of the transition matrix would be for k = l + 1,
it was seen that

$$\mathrm{MRBF}_{l+1} = \frac{(l + 1)(1 - q_1^{l+1} q_2)}{(1 - q_1^{l+1} q_2)(1 - q_1 q_2)} = \frac{l + 1}{1 - q_1 q_2} .$$

Because of the nature of the simplified Markov chain transition
matrix for this problem (whereby all of the information of interest appeared
along the top row) it was seen that a direct algebraic induction solution,
without any use of Markov chains at all, was possible.


Finally, it was seen that the problem was actually much simpler; even
algebraic induction was not necessary. This was determined as follows. Let us
imagine that an observer is stationed by each mount, and that each observer
will remain by his mount for an infinite length of time (or for an infinite
number of trials from the situation "everything working" to "system down").
Each observer records how many rounds are fired from his mount until his
system (the fire control and his mount) breaks down. His system is
equivalent to a one mount system, as is that of each of the other observers,
so, over the long run, the average number of rounds between failure for his
system will be the same as the MRBF for one mount. For several mounts, then,
the MRBF for the system is just equal to the sum of the MRBF's for
individual mounts (whereby our earlier simplifying assumption that all mounts
have the same MRBF is seen to be unnecessary). While the common fire control
suggests dependency, the dependency exists only for each trial from
"everything working" to "system down"; it does not exist for the system MRBF.

From the above, it is seen that almost no mathematics was necessary for
solution. At the same time, the mathematics bears out the result obtained
through the purely intuitive approach just described.
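The pattern MRBF = l/(1 - q1 q2) can be checked with exact arithmetic by a first-step recursion on the number of working mounts. This is a sketch under the stated model assumptions (independent mounts with common success probability q1, fire control success probability q2, one round attempted per working mount per salvo); the function name is hypothetical and the recursion is not the simplified Markov chain method of the paper.

```python
from fractions import Fraction
from math import comb


def mean_rounds_between_failure(mounts, q1, q2):
    """Expected rounds attempted up to and including the failing salvo.

    With m working mounts a salvo attempts m rounds. The system continues
    only if the fire control works and at least one mount works; j survivors
    occur with binomial probability C(m, j) q1^j p1^(m - j).
    """
    p1 = 1 - q1
    expected = {}
    for m in range(1, mounts + 1):
        carried = sum(comb(m, j) * q1**j * p1**(m - j) * expected[j]
                      for j in range(1, m))
        # Solve E[m] = m + q2 * (carried + q1^m * E[m]) for E[m].
        expected[m] = (m + q2 * carried) / (1 - q2 * q1**m)
    return expected[mounts]


q1, q2 = Fraction(9, 10), Fraction(19, 20)
three_mounts = mean_rounds_between_failure(3, q1, q2)
```

Using `Fraction` inputs keeps the comparison with l/(1 - q1 q2) exact, so any disagreement would indicate a modelling difference and not rounding.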

3. RELIABILITY. For this application, the reliability for an N round
mission will be defined as the probability that a mission of N successful
rounds will be accomplished.

For one mount, we have a simple geometric distribution, and the N round
reliability can be expressed as

$$R_N = (q_1 q_2)^N \tag{12}$$

For two mounts, let $k_0$ be the number of salvos that would be required
if the Nth successful round were fired in the $k_0$th salvo and no breakdowns
occurred in the first $k_0 - 1$ salvos. If N is even, $k_0 = N/2$. If N is odd,
$k_0 = (N+1)/2$.


After some investigation into how the problem could best be
algebraically treated, it was found that the best approach would be one
whereby any necessary summations would be indexed by the number of successful
fire control salvos. Thus, for two mounts, N even, we have

$$R_N = q_1^N q_2^{k_0} + \sum_{k=k_0+1}^{N} 2 p_1 q_1^N q_2^k$$

which, for $q_2 < 1$, is found to be

$$R_N = q_1^N q_2^{k_0} + 2 p_1 q_1^N\, \frac{q_2^{k_0+1} - q_2^{N+1}}{1 - q_2} .$$

For two mounts, N odd, we have

$$R_N = q_1^{N+1} q_2^{k_0} + 2 p_1 q_1^N q_2^{k_0} + \sum_{k=k_0+1}^{N} 2 p_1 q_1^N q_2^k$$

which, for $q_2 < 1$, is found to be

$$R_N = q_1^{N+1} q_2^{k_0} + 2 p_1 q_1^N q_2^{k_0} + 2 p_1 q_1^N\, \frac{q_2^{k_0+1} - q_2^{N+1}}{1 - q_2} .$$

For three mounts, the problem becomes slightly more complicated. Let us
make the following definitions for $k_0$.

If

N ≡ 0 modulo 3, $k_0 = N/3$

N ≡ 1 modulo 3, $k_0 = (N+2)/3$

N ≡ 2 modulo 3, $k_0 = (N+1)/3$

The following probabilities of mutually exclusive events are defined.

1. P(0) is the probability that the N'th successful round occurs on
the $k_0$th salvo.

2. P(0A) is the probability that the N'th successful round occurs after
the $k_0$th salvo, but no mount failures occur in the first $(k_0 - 1)$ salvos.

3. P(1) is the probability that one mount failure occurs in the first
$(k_0 - 1)$ salvos, no more failures occur, and the N'th successful round occurs
after the $k_0$th salvo.

4. P(2) is the probability that two mount failures occur, at least one
before the $k_0$th salvo, and the N'th successful round occurs after the $k_0$th
salvo.
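Then

$$R_N = P(0) + P(0A) + P(1) + P(2) . \tag{20}$$

When N ≡ 0 modulo 3

$$P(0) = q_1^{N} q_2^{N/3} . \tag{21}$$

When N ≡ 1 modulo 3

$$P(0) = q_1^{N-1} q_2^{(N+2)/3} (1 - p_1^3) . \tag{22}$$

When N ≡ 2 modulo 3

$$P(0) = q_1^{N-2} q_2^{(N+1)/3} (q_1^3 + 3 p_1 q_1^2) . \tag{23}$$

When N ≡ 0 modulo 3

$$P(0A) = q_1^{N-3} q_2^{(N-3)/3} \left\{ (3 p_1 q_1^2 q_2)(q_1^2 q_2 + 2 p_1 q_1 q_2) + (3 p_1^2 q_1 q_2)\, q_1^2 q_2^2 \right\}$$
$$\qquad = 3 p_1 q_1^{N-1} q_2^{(N+3)/3} (1 - p_1^2 + p_1 q_1 q_2) . \tag{24}$$

When N ≡ 1 modulo 3

$$P(0A) = 0 . \tag{25}$$

When N ≡ 2 modulo 3

$$P(0A) = q_1^{N-2} q_2^{(N-2)/3} \left[ (3 p_1^2 q_1 q_2)\, q_1 q_2 \right] = 3 p_1^2 q_1^{N} q_2^{(N+4)/3} . \tag{26}$$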


Let S(N/2) be the smallest integer larger than or equal to N/2. Then

$$P(1) = \sum_{\substack{k=k_0+1 \\ k \le N/2}}^{S(N/2)} 3 p_1 q_1^{N-2k} q_1^{2k} q_2^{k} + \sum_{\substack{k=k_0+1 \\ k \le (N+1)/2}}^{S(N/2)} 3 p_1 q_1^{N-2k+1} q_1^{2k} q_2^{k} . \tag{27}$$

If $k_0 + 1 \le N/2$ the above generalizes to

$$3 p_1 (q_1^{N} + q_1^{N+1})\, \frac{q_2^{k_0+1} - q_2^{S(N/2)+1}}{1 - q_2} \tag{28}$$

for $q_2 < 1$. If $(k_0 + 1) > N/2$, P(1) does not exist (the case for N small).

$$P(2) = \sum_{k=k_0+1}^{N} 3 p_1^2\, C_k\, q_1^{N} q_2^{k} \tag{29}$$

where $C_k$ indicates the number of ways two numbers can add up to N-k
given the maximum of these two numbers is less than k.

For $S(N/2) \le k \le N$

$$C_k = N - k + 1 . \tag{30}$$

For $k_0 + 1 \le k \le S(N/2) - 1$

$$C_k = (-N + 3k - 1) . \tag{31}$$
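Then

$$\sum_{k=S(N/2)}^{N} 3 p_1^2 (N - k + 1)\, q_1^N q_2^k = 3 p_1^2 q_1^N \Big\{ \sum_{k=S(N/2)}^{N} (N - k - 1 + 2)\, q_2^k \Big\}$$

$$= 3 p_1^2 q_1^N \Big\{ \sum_{k=S(N/2)}^{N} (N + 2)\, q_2^k - \sum_{k=S(N/2)}^{N} (k + 1)\, q_2^k \Big\} .$$

Treating the right hand term within the brackets as the sum of derivatives
(being equal to a derivative of a sum) the above becomes, for $q_2 < 1$,

$$3 p_1^2 q_1^N \Big\{ (N + 2)\, \frac{q_2^{S(N/2)} - q_2^{N+1}}{1 - q_2} - \Big[ \sum_{k=0}^{N} (k + 1)\, q_2^k - \sum_{k=0}^{S(N/2)-1} (k + 1)\, q_2^k \Big] \Big\}$$

$$= 3 p_1^2 q_1^N \Big\{ (N + 2)\, \frac{q_2^{S(N/2)} - q_2^{N+1}}{1 - q_2} - \frac{1 - (N + 2)\, q_2^{N+1} + (N + 1)\, q_2^{N+2}}{(1 - q_2)^2} + \frac{1 - (S(N/2) + 1)\, q_2^{S(N/2)} + S(N/2)\, q_2^{S(N/2)+1}}{(1 - q_2)^2} \Big\} .$$

In a similar manner, we find

$$\sum_{k=k_0+1}^{S(N/2)-1} (-N + 3k - 1)\, 3 p_1^2 q_1^N q_2^k$$

$$= 3 p_1^2 q_1^N \Big\{ -(N + 4)\, \frac{q_2^{k_0+1} - q_2^{S(N/2)}}{1 - q_2} + 3\, \frac{1 - (S(N/2) + 1)\, q_2^{S(N/2)} + S(N/2)\, q_2^{S(N/2)+1}}{(1 - q_2)^2} - 3\, \frac{1 - (k_0 + 2)\, q_2^{k_0+1} + (k_0 + 1)\, q_2^{k_0+2}}{(1 - q_2)^2} \Big\} .$$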
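The two closed forms for the weighted geometric sums can be verified with exact rational arithmetic. The sketch below assumes the identity that the sum of (k + 1) q^k from k = 0 to m equals (1 - (m + 2) q^(m+1) + (m + 1) q^(m+2)) / (1 - q)^2, which is the derivative-of-a-sum step; the common factor 3 p1^2 q1^N is omitted and the function names are hypothetical.

```python
from fractions import Fraction


def geometric_block(q, lo, hi):
    """Sum of q^k for k = lo..hi (zero when lo = hi + 1)."""
    return (q**lo - q**(hi + 1)) / (1 - q)


def weighted_partial(q, m):
    """Sum of (k + 1) q^k for k = 0..m, from differentiating a geometric sum."""
    return (1 - (m + 2) * q**(m + 1) + (m + 1) * q**(m + 2)) / (1 - q) ** 2


def upper_sum_closed(q, n, s):
    """Sum of (n - k + 1) q^k for k = s..n."""
    return ((n + 2) * geometric_block(q, s, n)
            - (weighted_partial(q, n) - weighted_partial(q, s - 1)))


def lower_sum_closed(q, n, k0, s):
    """Sum of (-n + 3k - 1) q^k for k = k0+1..s-1."""
    return (-(n + 4) * geometric_block(q, k0 + 1, s - 1)
            + 3 * (weighted_partial(q, s - 1) - weighted_partial(q, k0)))
```

For example, with N = 12 one has k0 = 4 and S(N/2) = 6, so the lower sum contains the single term k = 5 and the upper sum runs over k = 6, ..., 12.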
MARKOV AND PATH DEPENDENT PROCESSES
APPLIED TO CONTINUOUS SAMPLING
PLANS IN TANDEM

China Lake, California


ABSTRACT . A continuous  sampling  scheme,  consisting  of  two  generic 
Continuous  Sampling  Flans  (CSF)  in  series,  is  analysed.  This  serial 
arrangement  is  used  for  the  attribute  sampling  for  two  different  Indepen- 
dent characteristics  of  items  in  a given  production  run(  the  output  from 
the  first  plan  forme  the  input  to  the  second.  Using  standard  one  dimen- 
sional Markov  Chain  (HC)  models  for  the  generic  CSP's,  the  aerial  CSP 
model  is  shown  to  be  equivalent  to  a two  dimensional  (or  second  order)  MC 
wherein  the  state  of  the  second  component  is  directly  dependent  on  that  of 
the  first. 

The ergodic properties of the marginal distribution of the second
component are analyzed by using 1) the ergodic theorem applied to matrix
valued random variables, 2) a nonstationary MC approximation to a path
dependent process, and 3) direct products of transition matrices constrained
by the dependence mentioned above. In the latter two approaches, the MC's
are shown to be aperiodic and (strongly) ergodic; either one can be used
to show convergence of the path dependent process. Taking the appropriate
limits, as the production run becomes infinite, it is proven that the
limiting probabilities for the second component are independent of those
of the first.

Using direct products, the analysis is extended to the case of three
or more CSP's in tandem. Under the additional assumption of a separable
initial probability vector and for $n \ge 2$, the direct product MC, which is
ergodic and stationary, is shown to be equivalent to a finite sequence of
n MC's. In this sequence, the first MC is ergodic and stationary; the
remaining MC's are (strongly) ergodic and nonstationary. Comparisons are
also made with other naturally arising multicharacteristic sampling plans.


1.0  INTRODUCTION. 

1.1 Continuous Sampling Plans. Given a production line of items, a (one
characteristic) Continuous Sampling Plan (CSP) consists of two or more
phases of attribute sampling for an item characteristic directly from
the line. In at least one phase, the sampling frequency is zero with an
exit occurring only after a fixed number of items are found to be
consecutively nondefective (screening phase). The phases are always connected
in such a way that each of them is "positive recurrent" for an (abstract)
infinite production run. Moreover, the number of phases is finite and
an exit from any one of them takes place after a finite number of
production units with probability one.


CSP's are modelled by Markov Chains (MC) which, because of the phase
structuring, are finite, aperiodic, and irreducible. The plans and their
MC models are discussed at length in References 6.2 and 6.5. The simplest
of the CSP's, CSP-1, along with its usual MC model, is described in
Chapter 2.

1.2 Origin of Tandem CSP's. In the past, CSP-1 has been used in a serial
manner to sample for eight different characteristics per production unit.
In practice, the characteristics were sampled for at successive stations
along the production line. It is this type of sampling that is generalized
and modelled in Chapter 2 and further analyzed in the succeeding chapters.


1.3 Contents of Paper. In Chapter 2, after describing CSP-1 and its MC
model, Semi Markov Chains (SMC) are introduced and utilized to simplify
the MC model in two ways: the "classical" way, driven by a particular
functional, and a second way, motivated by the serial sampling plan and
the idea of a controlled Markov Chain (MC). Such a SMC simplification of
a MC is called SMC reduction (see Reference 6.2). The description of
(2)-serial CSP-1 is then given followed by a second order MC ((2)-MC)
model for it. The (2)-MC model is based on the assumption of independent
characteristics.

In Chapter 3, the second SMC reduction is used in developing two
similar approaches to the simplification of the (2)-MC model. The major
connections between the resulting models are also brought out. The second,
path dependent model is approximated by a strongly ergodic nonstationary
MC. In Reference 6.2, it is erroneously stated that this approximation
is equivalent to the (2)-MC. Thus, one of the major purposes of Chapter 3
is to clarify the assumptions made which make the nonstationary MC differ
from the (2)-MC.

In Chapter 4, the longest of the chapters, a third method is given
which utilizes the concept of the direct product of matrices. For $n > 2$,
(n)-serial CSP-1 is also handled by the same techniques and the CSP-1
restriction is eventually dropped. For (n)-serial CSP-1, it is also
shown that its direct product MC, which is stationary and ergodic, can be
separated into n MC's. The first of these MC's is also stationary (and
ergodic) in contrast to the remaining ones which are nonstationary (and
strongly ergodic). Furthermore, the latter n-1 nonstationary MC's exhibit
structures which are essentially different from the one exhibited by the
nonstationary MC in Chapter 3. Of primary interest is the marginal Average
Fraction Inspected (AFI) functional for the last plan in tandem. This
functional is compared to the one which results from use of the plan by
itself.

Chapter 5 concludes the paper by summing up the major conclusions
and theorems as well as suggesting some further possibilities for and
modifications of multicharacteristic sampling plans.

1.4 Glossary. In References 6.1, 6.2, and 6.3, the clearance number, which
characterizes the screening phase of CSP-1, is denoted by the capital
letter I. However, in this paper, "I" might be confused with the identity
matrix and thus small i will be used instead for the clearance number.

Henceforth, references will be denoted by numbers in brackets (e.g.,
"References 6.2 and 6.3" will be written as [6.2, 6.3]). Common
abbreviations and notations are given below.

r x m = r columns and m rows
[a.e.] = almost everywhere
pv = probability vector (non-negative entries with sum = 1)
CSP = Continuous Sampling Plan
FI(N) = Fraction Inspected out of N units
AFI(N) = Average of FI(N)
$AFI_n(\infty)$ = Marginal $AFI(\infty)$ for the nth plan in a (n)-serial CSP
MC = Markov Chain; SMC = Semi Markov Chain
M(·) = MC process; X(·) = SMC process
$A \otimes B$ = Direct product of the two matrices
$\Lambda_{1n}$ = Transition matrix of (n)-serial CSP-1


1.5 Acknowledgment. Mrs. Leah K. Jonas deserves full credit for the
excellent and expeditious typing of the paper as well as for the drafting
of some complicated diagrams and the proper rendering of special technical
symbols.


2.0  BACKGROUND. 

2.1 CSP-1. This sampling plan, the simplest of its type, is characterized
by one variable and two parameters. The variable, p, is the probability
of finding a defective item (characteristic) under the assumption that the
product flow forms a Bernoulli process. The two parameters are i, the
clearance number required to exit from the screening phase (abbr. sc), and
f, the sampling frequency to use during the unlimited sampling phase (abbr.
uls). Thus, when necessary for clarity, a particular CSP-1 will be written
explicitly as CSP-1[p; i,f]. The black box description of and the MC model
for the plan appear in Figures 1 and 2, respectively.


Figure 1

Block Diagram of CSP-1[p; i,f]

First box = screening phase (sc)

Second box = unlimited sampling phase (uls)


Figure 2

Markov Chain Model of CSP-1[p; i,f]

p = Probability of defective; q = 1-p
i = Clearance number
f = Sampling frequency; v = 1-f
Hj = MC state of sc, $0 \le j \le i-1$
SN = Noninspection MC state of uls
SI = Inspection MC state of uls
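As a rough illustration of how the plan in Figures 1 and 2 operates, the following sketch simulates CSP-1[p; i,f] on a Bernoulli product flow and counts the fraction of units inspected. The function names are hypothetical, the use of probabilistic sampling in uls (each unit inspected independently with probability f) is an assumption, and the comparison value f/(f + (1-f)q^i) is the classical closed form for the long-run AFI of CSP-1, which is quoted here only as a cross-check and is not derived in this paper.

```python
import random


def simulate_csp1(p, i, f, n_units, seed=0):
    """Simulate CSP-1[p; i,f] and return the fraction of units inspected."""
    rng = random.Random(seed)
    screening = True   # start in the screening phase (sc)
    run = 0            # consecutive nondefective units found while screening
    inspected = 0
    for _ in range(n_units):
        defective = rng.random() < p          # Bernoulli product flow
        if screening:
            inspected += 1                    # every unit is inspected in sc
            if defective:
                run = 0
            else:
                run += 1
                if run == i:                  # clearance number reached
                    screening = False
        elif rng.random() < f:                # unit sampled in uls (assumed random sampling)
            inspected += 1
            if defective:                     # defect found: back to screening
                screening = True
                run = 0
    return inspected / n_units


def closed_form_afi(p, i, f):
    """Classical long-run AFI of CSP-1, used here only as a cross-check."""
    q = 1.0 - p
    return f / (f + (1.0 - f) * q ** i)


if __name__ == "__main__":
    p, i, f = 0.05, 5, 0.2
    print("simulated AFI  :", simulate_csp1(p, i, f, 200_000))
    print("closed-form AFI:", closed_form_afi(p, i, f))
```

For a long run the two printed values should agree to within simulation noise; the simulated value depends on the seed.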


2.2 Semi Markov Chains. Semi Markov Chains (SMC) can be used to simplify
CSP's and, specifically, CSP-1. Below, a brief exposition of SMC's is
given. For further details, see [6.2, Appendix of 6.3, or 6.6].

For discrete (and integral) $t \ge 0$, let X(t) be a (discrete) stochastic
process. Then we have

Definition 1. X(t) is a finite Semi Markov Chain iff its state space
is finite and the following relationship holds

\[ \mathrm{Prob}[Y(n), W(n) \mid Y(m), W(m);\ 0 \le m \le n-1] = \mathrm{Prob}[Y(n), W(n) \mid Y(n-1), W(n-1)] \]

where $Y(m+1) = X(t_{m+1})$, $T(m+1) = W(m+1) - W(m)$ is the time of sojourn in state
Y(m) from its entrance until its exit to state Y(m+1), $t_{m+1}$ is a particular
realization of the random variable W(m+1) which in turn is the total
time to the (m+1)st transition, and $Y(m) \ne Y(m+1)$ for all m.

For further reference, we have

Definition 2. Let i, k be in the state space of X(·). Then

a. The (defective) pdf of the time to transition from state i to state
k is

\[ Q_{i,k}(t) = \mathrm{Prob}[X(t) = k,\ X(t') = i,\ t > t' \ge 0 \mid X(0) = i], \]

for $i \ne k$ and is otherwise zero.

b. The probability of starting in state i at time zero and being in
state k at time t is given by

\[ P_{i,k}(t) = \mathrm{Prob}[X(t) = k \mid X(0) = i]. \]

In Definition 1, the process Y(·) is a MC called the embedded MC of the
SMC. Letting $H_0$ be the Heaviside sequence, the transition matrix for this MC
is

\[ \Big[ \lim_{t \to \infty} (H_0 * Q_{i,k})(t) \Big] \]

where the asterisk denotes the operation of convolution. If in Definition 2,
there should exist at least one state k such that $Q_{k,k}(\cdot)$ is not identically
zero, then self transitions are possible without being recorded by the SMC
apparatus. In this case, the concept of a Markov Renewal Process (MRP)
must be used. Referring to Definition 1, the MRP would be the process (X(·),
W(·)). In the rest of the paper, we will be dealing with aperiodic,
irreducible, and stationary SMC's. The definitions of all these concepts parallel
those for MC's. For further information on MRP's, types of SMC's and their
relationships with their embedded MC's, see [6.2, 6.3, or 6.7].

We finish this section by stating two theorems needed later on.


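Theorem 1. Given the SMC X(·), we have

\[ P_{i,k}(t) = \sum_j (Q_{i,j} * P_{j,k})(t) + \delta_{i,k} J_k(t) \]

where $Q_{i,j}$ is defined in Definition 2, $\delta_{i,k}$ is the
Kronecker delta, and

\[ J_k(t) = H_0 * \Big( \delta_0 - \sum_s Q_{k,s} \Big)(t) \]

for $\delta_0(t) = \delta_{0,t}$.

Proof. See [6.2, 6.6, or 6.7].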

Theorem 2. Given the SMC X(·), the following limit holds.

\[ \lim_{t \to \infty} P_{i,k}(t) = \frac{e_k \mu_k}{\sum_j e_j \mu_j} \equiv \alpha_k \]

where e is the unique eigenvector with eigenvalue 1 for the embedded MC and
$\mu_k$ is the mean time of sojourn in state k.

Proof. See [6.2, 6.6, or 6.7].

2.3 Simplification of CSP-1. The first simplification is driven by the
Fraction Inspected (FI) functional which is given in

Definition 3. For the model of CSP-1 appearing in Figure 2, the
Fraction Inspected (FI) functional is

\[ FI(N) = 1 - \frac{v}{N} \sum_{t=0}^{N} C_{(uls)}(t) \]

In the equation, N = the total number of units which have passed the
inspection station in real time, v = 1-f, and

\[ C_{(uls)}(t) = \begin{cases} 1, & \text{if } X(t) \text{ is in uls} \\ 0, & \text{otherwise.} \end{cases} \]


Taking the conditional average of FI(N) gives a function defined in

Definition 4. The Average Fraction Inspected (AFI), for the first N
units and starting in either MC state H0 or in any state under equilibrium
conditions is

\[ AFI(N) = E[FI(N) \mid M(0) = H0] = E_e[FI(N)], \text{ also} \]

where M(·) is the MC process, E[·] the expectation operator, and e the long
run probability vector (pv).

Concerning the first simplification of CSP-1, we have

Theorem 3. Letting sc = 1 and uls = 2 (see Figure 2), we can construct
the following SMC whose states are defined in terms of the z transform
[6.1, 6.2, or 6.11].

States: $(1, Q_{12}(z))$ and $(2, Q_{21}(z))$

where

\[ Q_{12}(z) = \frac{q^i (z - q)}{z^i (z - 1) + \gamma}, \qquad Q_{21}(z) = \frac{\delta}{z - \bar{\delta}}, \qquad \gamma = p q^i, \]

$\delta = fp$, and $\bar{\delta} = 1 - \delta$.

Proof. See [6.2].
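The two transforms in Theorem 3 can be checked numerically. The sketch below assumes that the z transform is taken in powers of 1/z, that the sojourn in sc is the number of units needed to observe i consecutive nondefectives, and that the closed forms are $Q_{12}(z) = q^i(z-q)/(z^i(z-1)+\gamma)$ and $Q_{21}(z) = \delta/(z-\bar{\delta})$; all function names are hypothetical.

```python
def first_passage_pmf(p, i, t_max):
    """pmf of the number of units needed to see i consecutive nondefectives."""
    q = 1.0 - p
    run = [0.0] * i        # run[r] = P(still screening with current run length r)
    run[0] = 1.0
    pmf = [0.0] * (t_max + 1)
    for t in range(1, t_max + 1):
        pmf[t] = q * run[i - 1]               # the run is completed by unit t
        new = [0.0] * i
        new[0] = p * sum(run)                 # a defective resets the run
        for r in range(i - 1):
            new[r + 1] = q * run[r]
        run = new
    return pmf


def geometric_pmf(delta, t_max):
    """pmf of the sojourn in uls: exit with probability delta at each unit."""
    return [0.0] + [delta * (1.0 - delta) ** (t - 1) for t in range(1, t_max + 1)]


def z_transform(pmf, z):
    """Truncated transform sum_t pmf[t] * z**(-t)."""
    return sum(mass * z ** (-t) for t, mass in enumerate(pmf))


def q12_closed(z, p, i):
    q = 1.0 - p
    return q ** i * (z - q) / (z ** i * (z - 1.0) + p * q ** i)


def q21_closed(z, p, f):
    delta = f * p
    return delta / (z - (1.0 - delta))


if __name__ == "__main__":
    p, i, f, z = 0.05, 5, 0.2, 1.5
    print(z_transform(first_passage_pmf(p, i, 400), z), q12_closed(z, p, i))
    print(z_transform(geometric_pmf(f * p, 400), z), q21_closed(z, p, f))
```

Each printed pair compares a truncated series with the corresponding closed form at one test point with |z| > 1.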

Corollary 1. The unlimited sampling phase of CSP-1 can be reduced to
a MC state SI with a geometric pdf.

Proof. From Theorem 3, the function $Q_{21}(z)$ is the transform of a
(nondefective) pdf which can be written (in the time domain) as

\[ Q_{SI,H0}(t) = \delta\, \bar{\delta}^{\,t-1}, \quad t \ge 1. \]

In the above equation, H0 is used since the application of this Corollary
will be to the MC model.

Corollary 2. Starting in state 1 at time zero, the FI(N) functional
in Definition 3 has a limit as N approaches infinity given by

\[ \lim_{N \to \infty} FI(N) = 1 - v\alpha_2 \quad [a.e.] \]
\[ \phantom{\lim_{N \to \infty} FI(N)} = AFI(\infty) \]

Proof. The first equality follows from Theorem 2 applied to the SMC
constructed in Theorem 3 and the ergodic theorem for functionals defined
(or, in this case, definable) on SMC's. The second equality follows from
Definition 4 and the facts that "M(0) = H0" is equivalent to "X(0) = 1"
and $E[C_2(t) \mid X(0) = 1] = P_{12}(t)$.
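The limit in Corollary 2 can be evaluated in two ways and the results compared. The sketch below is a minimal check under stated assumptions: the two-state embedded MC simply alternates between sc and uls (so its eigenvector cancels in Theorem 2), the mean sojourn in sc is the mean number of units needed for i consecutive nondefectives, the mean sojourn in uls is 1/(fp), and the MC of Figure 2 is represented by a reduced chain with i screening states plus one collapsed uls state. All function names are hypothetical.

```python
def csp1_matrix(p, i, f):
    """Reduced CSP-1 chain: states 0..i-1 are screening run lengths, state i is uls."""
    q, delta = 1.0 - p, f * p
    n = i + 1
    A = [[0.0] * n for _ in range(n)]
    for k in range(i):
        A[k][0] += p           # defect found while screening: restart the run
        A[k][k + 1] += q       # one more consecutive nondefective
    A[i][0] += delta           # defect found while sampling: back to screening
    A[i][i] += 1.0 - delta
    return A


def stationary(A, iters=5000):
    """Long-run probability vector by power iteration (row-vector convention)."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[r] * A[r][c] for r in range(n)) for c in range(n)]
    return v


def alpha_uls(p, i, f):
    """Limiting probability of uls from mean sojourn times (Theorem 2)."""
    q = 1.0 - p
    mu_sc = (1.0 - q ** i) / (p * q ** i)   # mean units per visit to sc
    mu_uls = 1.0 / (f * p)                  # mean units per visit to uls
    return mu_uls / (mu_sc + mu_uls)


if __name__ == "__main__":
    p, i, f = 0.05, 5, 0.2
    v = 1.0 - f
    print("AFI via SMC sojourn times :", 1.0 - v * alpha_uls(p, i, f))
    print("AFI via MC stationary pv  :", 1.0 - v * stationary(csp1_matrix(p, i, f))[i])
```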


The second simplification will be used in Chapter 3.

Theorem 4. From the MC model of CSP-1, a SMC can be constructed with
the following states (again in terms of the z transform)

Statesi  (a,  ^ba<*)) 


where 


The  transfer  functions  for  the  intermediate  states  c and  d are 


^cd<*>  • (j)  “d  Qca<«)  - (1-  (f ) . 

Proof. Let a = H0, c = Hj, for $1 \le j \le i-1$, and d = SI. Then a and
d have geometric pdf's and are thus (trivial) SMC states. From [6.2], c
is a SMC state with the given transform. Using a routine combinatorial
argument, we have (dropping the argument z)

m ^ 

^ba  " ^cd*Qda  fY'  ^^ac'Oca)^^ 

which reduces to the given form by summation of a geometric series for
$|z| > 1$.

2.4 A MC Model for (2)-Serial CSP-1. We consider two (different) CSP-1's in
tandem: CSP-1[$p_k$; $i_k$, $f_k$], k = 1, 2, with MC and SMC states {Hjk, SIk} and
{$a_k$, $b_k$}, respectively.

The (2)-MC model of (2)-serial CSP-1 is based on the assumption that
the two item characteristics being sampled for are independent. Following
the practical case discussed in Chapter 1, two item characteristics are
sampled for at two successive stations along a production line, according
to two (different) CSP-1 types. If an item is rejected because of a
defective first characteristic, then the second characteristic is not sampled
for. Thus a transition to H01 occurs in the first plan but no transition
at all occurs in the second plan for the given operational time increment
which the item represents. However, if the item passes muster for the
first characteristic (i.e., the item is inspected and found to be
nondefective in the 1st characteristic or, because of $f_1$, is not inspected),
a transition takes place in the first plan to a state other than H01 (or
$a_1$) and the item moves on to the second station. Thus, in this latter
situation, a transition takes place in both MC's for the specific
operational time increment generated by the unit. We translate this
viewpoint into the (2)-MC model given in Figure 3.

Figure 3

Second Order Markov Chain Model for (2)-Serial CSP-1

States: {(k1, i2), (k1, k2), (i1, k2), (i1, i2)},
for $0 \le k_j \le i_j - 1$, j = 1,2

Transitions: ((kj)+1 may be ij, j = 1,2)

From state      To state              Probability
(k1, k2)        ((k1)+1, (k2)+1)      $q_1 q_2$
(i1, k2)        (i1, (k2)+1)          $\bar{\delta}_1 q_2$
(k1, i2)        ((k1)+1, i2)          $q_1 \bar{\delta}_2$
(i1, i2)        (i1, i2)              $\bar{\delta}_1 \bar{\delta}_2$
(k1, k2)        ((k1)+1, 0)           $q_1 p_2$
(i1, k2)        (i1, 0)               $\bar{\delta}_1 p_2$
(k1, i2)        ((k1)+1, 0)           $q_1 \delta_2$
(i1, i2)        (i1, 0)               $\bar{\delta}_1 \delta_2$
(k1, x)         (0, x)                $p_1$
(i1, x)         (0, x)                $\delta_1$

(x = i2 or k2). Here $\delta_j = f_j p_j$ and $\bar{\delta}_j = 1 - \delta_j$.
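The transition rules of Figure 3 can be written out mechanically. The following sketch builds the (2)-MC as a dictionary of transition probabilities and is meant only as a consistency check; it assumes that state index $i_j$ stands for the (collapsed) uls state of plan j, that $\delta_j = f_j p_j$, and the function name `two_mc` is hypothetical.

```python
from itertools import product


def two_mc(p1, i1, f1, p2, i2, f2):
    """Transition probabilities of the (2)-MC, keyed as T[from_state][to_state]."""
    d1, d2 = f1 * p1, f2 * p2                 # delta_j = f_j * p_j
    states = list(product(range(i1 + 1), range(i2 + 1)))
    T = {s: {} for s in states}
    for s1, s2 in states:
        fail1 = p1 if s1 < i1 else d1         # defect found at station 1
        fail2 = p2 if s2 < i2 else d2         # defect found at station 2
        next1 = min(s1 + 1, i1)               # (k1)+1 may be i1
        next2 = min(s2 + 1, i2)               # (k2)+1 may be i2
        moves = [
            ((0, s2), fail1),                              # rejected at station 1
            ((next1, next2), (1 - fail1) * (1 - fail2)),   # passes both stations
            ((next1, 0), (1 - fail1) * fail2),             # rejected at station 2
        ]
        for target, prob in moves:
            T[(s1, s2)][target] = T[(s1, s2)].get(target, 0.0) + prob
    return T


if __name__ == "__main__":
    T = two_mc(0.05, 3, 0.2, 0.10, 2, 0.25)
    for state, row in sorted(T.items()):
        print(state, {t: round(pr, 4) for t, pr in sorted(row.items())})
```

Each row lists the three possible moves from a state: rejection at the first station (the second plan does not move), passage through both stations, and rejection at the second station.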


The result is a rather complicated 2 dimensional lattice. The
remaining chapters reduce the study of this model and, more generally, similar
models for (n)-serial CSP-1's and functionals defined on them to a manageable
systematic analysis with various degrees of success. To help in this analysis,
we fix some more ideas in two more definitions before leaving Chapter 2.

Definition 5. A (n)-serial plan is the same as a (n)-serial CSP and
consists of CSP's arranged in tandem such that the output of the jth plan
is the input to the (j+1)st plan, $1 \le j \le n-1$. For a given operational time
increment, given by the movement of a production unit through the sampling
stations, a transition takes place in the (j+1)st plan only if no defects
are found in the preceding j plans. Moreover, if a defect is found at the
jth station, no transitions take place in the consecutive plans after j.
However, the interpretation of "virtual transition" for "no transition" will
also be used when convenient to do so. If only a particular type of CSP is
used, the serial plan will be called a (n)-serial CSP-"type". If the CSP's
are mixed types, the plan will generally be written out: (CSP-type(1)) - ... -
(CSP-type(n)).

Definition 6. A multicharacteristic plan (MCP) will be used as a generic
term while a non-CSP MCP will be called a variant MCP.


3.0 TWO APPROACHES TO (2)-SERIAL CSP-1. The two approaches are given in
Sections 3.1 and 3.2. The connections between them are given in Section 3.3.
In addition, a and b are the SMC states appearing in Theorem 4 for the first
plan, $A_2$ is the usual transition matrix for the second plan used alone,
and $I_2$ is the identity matrix of rank $i_2$.

3.1 Average Transition Matrix. Given the (2)-MC model for (2)-serial CSP-1,
we first define the matrix valued characteristic functional in

Definition 7. Let 1) $\omega$ be a realization of the process $(X(t), M_2(t))$,
where X(·) is the SMC variable for the first plan and $M_2(\cdot)$ is the MC
variable for the second plan and 2) $\mathrm{Proj}_t(\omega)$ be the projection to the first
component at time t. Then the matrix valued characteristic functional is

\[ C_t(\omega) = \begin{cases} A_2, & \text{if } \mathrm{Proj}_t(\omega) = b \\ I_2, & \text{if } \mathrm{Proj}_t(\omega) = a \end{cases} \]

Using the idea of a controlled MC (see [6.12]) and Definition 7, we can
prove

Theorem 5. As N approaches infinity,

\[ \frac{1}{N} \sum_{t=1}^{N} C_t(\omega) \to \alpha_a I_2 + \alpha_b A_2 \quad [a.e.] \]

Proof. We can break the matrix valued random variable up as

\[ C_t(\omega) = a_t(\omega) I_2 + b_t(\omega) A_2 \]

The functionals $a_t(\cdot)$ and $b_t(\cdot)$ have the obvious definitions: $a_t(\omega) = 0$ or
1 iff $\mathrm{Proj}_t(\omega) = b$ or a, respectively, and $b_t(\omega) = 1 - a_t(\omega)$. Then the above
average sum can be similarly decomposed. The theorem then follows from the
definition of the (2)-MC model given in Figure 3, the SMC reduction of CSP-1
in Theorem 4, and the ergodic theorem for functionals defined on SMC's.


Using Theorem 5, an average operator can be associated with the
second plan in

Definition 8. Given the RHS of the limit in Theorem 5, the Average
Transition Matrix for the second plan is

\[ \bar{A}_2 = \alpha_a I_2 + \alpha_b A_2 \]

Clearly, for the second plan, the expression for $\bar{A}_2$ can be looked
upon as stating that, in the long run, $I_2$ is the (virtual) "transition
matrix" $(100)\alpha_a$% of the time while $A_2$ is the appropriate matrix for the
remaining $(100)\alpha_b$% of the time. To elaborate somewhat, $I_2$ can be interpreted
as the "Stop" matrix. That is, when $I_2$ is employed, no transitions take
place as far as production unit time is concerned. A possibly better
interpretation is to consider (virtual) transitions as taking place according to
the identity matrix but to define the relevant functionals only for
transitions which occur according to $A_2$. With the latter viewpoint, we then have
a path dependent nonstationary process (see Section 3.2).

Given $\bar{A}_2$, we have

Theorem 8.

\[ \lim_{n \to \infty} (\bar{A}_2)^n = L_2 \]

where $L_2$ is the usual long run matrix for the second plan. That is, the
columns are all identically equal to the long run probability vector (pv)
$e_2$.

Proof.

\[ (\bar{A}_2)^n = \sum_{j=0}^{n} \binom{n}{j} \alpha_a^j (I_2 - A_2)^j A_2^{n-j} \qquad (1) \]

However,

\[ \lim_{j \to \infty} A_2^j = L_2 \qquad (2) \]

Therefore, Eq. (2) and summability theory [6.9] imply that the limit exists
for Eq. (1) and is $L_2$.

Theorem 8 shows that the use of the average matrix gives the same long
run results as the use of $A_2$. Thus, using this first approach results
in a marginal AFI(∞) which is the same as that which would be obtained if
the second plan were to be used by itself.
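The convergence of powers of the average transition matrix to the long run matrix of the second plan can be observed numerically. The sketch below assumes a reduced CSP-1 chain with i screening states plus one collapsed uls state, takes $\alpha_a$ to be the long-run probability that the first plan is in H0, and uses the row-vector convention (so the rows, rather than the columns, of the limit equal the long run pv). All names and parameter values are hypothetical.

```python
def csp1_matrix(p, i, f):
    """Reduced CSP-1 chain: states 0..i-1 are screening run lengths, state i is uls."""
    q, delta = 1.0 - p, f * p
    n = i + 1
    A = [[0.0] * n for _ in range(n)]
    for k in range(i):
        A[k][0] += p
        A[k][k + 1] += q
    A[i][0] += delta
    A[i][i] += 1.0 - delta
    return A


def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]


def stationary(A, iters=5000):
    """Long-run probability vector by power iteration (row-vector convention)."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[r] * A[r][c] for r in range(n)) for c in range(n)]
    return v


def average_matrix(alpha_a, A2):
    """alpha_a * I + (1 - alpha_a) * A2."""
    n = len(A2)
    return [[alpha_a * (1.0 if r == c else 0.0) + (1.0 - alpha_a) * A2[r][c]
             for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    A1 = csp1_matrix(0.05, 5, 0.2)       # first plan
    A2 = csp1_matrix(0.10, 3, 0.25)      # second plan
    alpha_a = stationary(A1)[0]          # long-run probability of H0 in plan 1
    M = average_matrix(alpha_a, A2)
    for _ in range(12):                  # repeated squaring: M becomes Abar^(2**12)
        M = matmul(M, M)
    print("row of Abar^n :", [round(x, 6) for x in M[0]])
    print("long run pv e2:", [round(x, 6) for x in stationary(A2)])
```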


3.2 Path Dependent Model. The model is given by the matrices in

Definition 9. The path dependent matrices for (2)-serial CSP-1 are

\[ A(t;\omega) = \prod_{k=1}^{t} \big( a_k(\omega) I_2 + b_k(\omega) A_2 \big) \]

where $a_k(\cdot)$ and $b_k(\cdot)$ are defined in the proof of Theorem 5 and the matrices
are defined in Section 3.0.

Let the conditional expectations $E[\,\cdot \mid X_1(0) = a]$ and $E_e[\,\cdot\,]$ operate
on the above matrices to yield matrices $A_a(t)$ and $A_e(t)$, respectively. Also
let $e^0 = (1, 0, \ldots, 0)$, $i_1$ times, and $\mu$ be an arbitrary pv with $i_2$ entries.
Then, a little reflection shows that use of the (2)-MC model with initial pv =
$e^0 \otimes \mu$ or $e_1 \otimes \mu$ is equivalent to using $A_a(t)$ or $A_e(t)$, respectively,
with initial pv = $\mu$.

Using the equality $b_k(\cdot) = 1 - a_k(\cdot)$, we can rewrite the matrices in
Definition 9 as

\[ A(t;\omega) = \prod_{k=1}^{t} \big( a_k (I_2 - A_2) + A_2 \big) \qquad (A3) \]

Multiplying the RHS of Eq. (A3) out, we get

\[ A(t;\omega) = \sum_{s=0}^{t} \Big( {\sum}'' a_{j_1} a_{j_2} \cdots a_{j_s} \Big) (I_2 - A_2)^s A_2^{t-s} \qquad (B3) \]

In Eq. (B3), the arguments of the $a_j$'s have been dropped for notational
convenience and $\sum''$ is the restricted summation obtained by requiring that
$j_1 < j_2 < \cdots < j_s$.

From Eq. (B3), a recursive scheme can be developed. For a (complex)
polynomial of degree n with roots $r_j$ (j = 1 to n), let $S_k(r_1, \ldots, r_n)$ be the
kth symmetric function of the roots (associated with the variable of power
n-k). For simplification in using Eq. (B3), define

\[ D_k^n(\omega) = S_k(a_1, \ldots, a_n). \]


*Bl  is  SMC  notation  for  Sj. 
Thus, for example,

\[ D_0^n = 1, \quad D_1^n = \sum_j a_j, \quad \text{and} \quad D_n^n = \prod_j a_j. \]

Since $a_k(\omega) = 0$ or 1, we can consider the RHS of Eq. (B3) as a random matrix
polynomial over the binary field. Using the symmetric functions, we have

Theorem 9. With the special random symmetric functions defined above,
we have a recursive relationship between the coefficients of $A(n;\omega)$ and
$A(n+1;\omega)$ where we treat $(I_2 - A_2)$ and $A_2$ as polynomial indeterminates.

Proof. The recursion is obtained by expressing $A(n+1;\omega)$ as $A(0,n;\omega) \cdot A(n,n+1;\omega)$
and equating coefficients. Explicitly, the recursion is given by

\[ D_0^{n+1} = 1 \]
\[ D_1^{n+1} = a_{n+1} + D_1^n \]
\[ D_k^{n+1} = D_k^n + a_{n+1} D_{k-1}^n, \quad 1 < k \le n \]
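The recursion of Theorem 9 is the usual one for elementary symmetric functions and is easy to exercise on a sample path of the 0/1 indicators. In the sketch below the function names are hypothetical, out-of-range terms are taken to be zero so that the top coefficient is also produced, and scalars x and y stand in for the commuting indeterminates $(I_2 - A_2)$ and $A_2$.

```python
from itertools import combinations
from math import prod


def symmetric_direct(a, k):
    """k-th elementary symmetric function: sum of products over j1 < ... < jk."""
    return sum(prod(c) for c in combinations(a, k))


def symmetric_recursive(a):
    """All D_0..D_n built one indicator at a time via D_k' = D_k + a_new * D_(k-1)."""
    D = [1]
    for x in a:
        D = [1] + [(D[k] if k < len(D) else 0) + x * D[k - 1]
                   for k in range(1, len(D) + 1)]
    return D


def expand_product(a, x, y):
    """Evaluate prod_k (a_k * x + y) directly for scalar stand-ins x and y."""
    out = 1.0
    for ak in a:
        out *= ak * x + y
    return out


def expand_sum(a, x, y):
    """Evaluate sum_s D_s * x**s * y**(n-s), the scalar analogue of Eq. (B3)."""
    n = len(a)
    return sum(d * x ** s * y ** (n - s) for s, d in enumerate(symmetric_recursive(a)))


if __name__ == "__main__":
    path = [1, 0, 1, 1, 0]               # a sample realization of a_1..a_5
    print(symmetric_recursive(path))
    print([symmetric_direct(path, k) for k in range(len(path) + 1)])
    print(expand_product(path, 0.3, 0.7), expand_sum(path, 0.3, 0.7))
```

The first two printed lists should coincide, as should the two scalar evaluations of the expanded product.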

In particular, Eq. (B3) is more useful for calculation of Eq. (A3) because of

Proposition 1.

\[ E[a_{j_1} a_{j_2} \cdots a_{j_s} \mid X(0) = a] = P_{aa}(j_1)\, P_{aa}(j_2 - j_1) \cdots P_{aa}(j_s - j_{s-1}). \]

Proof. Since state a has a geometric pdf,

\[ P_{aa}(j) = \mathrm{Prob}[M_1(j) = H0 \mid M_1(0) = H0] \]

where $M_1(\cdot)$ is the MC process for plan 1.

Corollary 1. For the first plan, letting $E[\,\cdot \mid X(0) = a] = E_a[\,\cdot\,]$, we have

\[ E_a[a_1 a_2 \cdots a_n] = p^n \]

Proof. Proposition 1 and definitions.

Corollary 2. With the same conditions as Corollary 1,

\[ E_a[D_0^n] = 1 \]
\[ E_a[D_1^{n+1}] = P_{aa}(n+1) + E_a[D_1^n] \]
\[ E_a[D_k^{n+1}] = E_a[D_k^n] + E_a\big[ D_{k-1}^n\, E_a[a_{n+1} \mid D_{k-1}^n] \big] \qquad (C3) \]

Proof. Theorem 9 and definitions.

Eq. (C3) is, in general, tedious to evaluate. As this equation stands,
the probability of the union of k overlapping events would have to be
evaluated. Thus we try an approximation such that Eq. (C3') holds:

\[ E_a\big[ D_{k-1}^n\, E_a[a_{n+1} \mid D_{k-1}^n] \big] = E_a[D_{k-1}^n]\, E_a[a_{n+1}] \qquad (C3') \]

However, Eq. (C3') is equivalent to the assumption that the random matrices
$A(j, j+1; \omega)$ are independent. Proceeding with this simplifying assumption, we
get the following nonstationary MC $\{A'(k)\}$ where

\[ A'(k) = P_{aa}(k) I_2 + P_{ab}(k) A_2 \qquad (D3) \]
Concerning  this  MC,  we  have 

Theorem 10. The nonstationary MC whose matrices are given by Eq. (D3)
is strongly ergodic. Its limit is expressed by

\[ \text{strong-lim}_{n} \prod_{k=1}^{n} A'(k) = L_2 \]

where the strong convergence is in the sense of the norm supremum (or any
norm equivalent to it in finite dimensional Euclidean space).

Proof. Each of the matrices has the unique eigenvector $e_2$ with
eigenvalue 1. From [6.10] and Theorem 8, the nonstationary MC is strongly ergodic
with the above limit since

\[ \lim_{k} A'(k) = \bar{A}_2, \]

where the limit is taken with respect to one of the above norms.
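The strong ergodicity asserted in Theorem 10 can be illustrated by multiplying out the matrices of Eq. (D3). The sketch below assumes $A'(k) = P_{aa}(k) I_2 + (1 - P_{aa}(k)) A_2$ with $P_{aa}(k)$ read off the first plan's k-step transition probabilities from H0 back to H0, a reduced CSP-1 chain with one collapsed uls state, and the row-vector convention. Function names and parameter values are hypothetical.

```python
def csp1_matrix(p, i, f):
    """Reduced CSP-1 chain: states 0..i-1 are screening run lengths, state i is uls."""
    q, delta = 1.0 - p, f * p
    n = i + 1
    A = [[0.0] * n for _ in range(n)]
    for k in range(i):
        A[k][0] += p
        A[k][k + 1] += q
    A[i][0] += delta
    A[i][i] += 1.0 - delta
    return A


def vecmat(v, A):
    return [sum(v[r] * A[r][c] for r in range(len(A))) for c in range(len(A[0]))]


def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]


def nonstationary_product(A1, A2, n_steps):
    """Product A'(1) A'(2) ... A'(n_steps) of the matrices in Eq. (D3)."""
    n2 = len(A2)
    M = [[1.0 if r == c else 0.0 for c in range(n2)] for r in range(n2)]
    row = [1.0] + [0.0] * (len(A1) - 1)       # plan 1 starts in H0 (state a)
    for _ in range(n_steps):
        row = vecmat(row, A1)                 # row[0] is now P_aa(k)
        p_aa = row[0]
        Ak = [[p_aa * (1.0 if r == c else 0.0) + (1.0 - p_aa) * A2[r][c]
               for c in range(n2)] for r in range(n2)]
        M = matmul(M, Ak)
    return M


if __name__ == "__main__":
    A1 = csp1_matrix(0.05, 5, 0.2)
    A2 = csp1_matrix(0.10, 3, 0.25)
    M = nonstationary_product(A1, A2, 400)
    for r in M:
        print([round(x, 6) for x in r])       # rows should look alike for large n
```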

The nonstationary MC in Theorem 10 is the approximation which is
erroneously stated to be equivalent to the (2)-MC with the expectation
operator $E_a[\,\cdot\,]$. To get some idea of the relationship between $E_a[A(k;\omega)]$
and $A'(k)$, we prove

Proposition 2. $P_{ab}(j)$ is a monotonically non-decreasing function.

Proof. Recalling the stochastic sequence W(·) from Definition 1 and
letting $T_{ab}$ be a sojourn time in a until exit to b, we have

\[ \Delta P_{aa}(n) = P_{aa}(n+1) - P_{aa}(n) = -Q_{ab}(n) + \sum_j \big\{ P[W_j(a) = n] - P[W_j(a) + T_{ab} = n] \big\} \qquad (1) \]

But the expression inside the summation sign in Eq. (1) is

\[ -P[W_j(a) + T_{ab} = n \text{ and } T_{ab} > 0] \le 0 \qquad (2) \]

From (1) and (2), the Proposition follows.

Corollary 3. The coefficients of the nonstationary MC are all less
than or equal to the corresponding ones of the expected value of the path
dependent model.

Proof. Abbreviating $P_{aa}(\cdot)$ by P(·), Proposition 2 shows that

\[ P(j_1) P(j_2) \cdots P(j_s) \le P(j_1) P(j_2 - j_1) \cdots P(j_s - j_{s-1}) \]

Each side is a general term of the two models, the LHS coming from the
nonstationary MC model and the RHS coming from the path dependent one.

3.3 Connections. The transition matrix $\bar{A}_2$ of Section 3.1 is clearly equal to

\[ \text{strong-lim}_{k} A'(k). \]

The connection between the nonstationary MC and the average of the path
dependent model has already been examined; the former is obtained from the
latter upon assuming the independence of the one step random matrices.
Nonstationary MC's also arise in Chapter 4 but they are more related to the
SMC reduction in Theorem 1 than to the reduction given in Theorem 2.


4.0 DIRECT PRODUCTS AND MULTICHARACTERISTIC PLANS. In this chapter $A_k$
will denote the transition matrix of the kth CSP in a serial plan. The plan
variables and parameters will also be indexed in the same manner (e.g., $p_k$,
$i_k$, $f_k$ for CSP-1). $I_k$ will denote the identity matrix of rank $i_k$.

We will use properties of direct products without detailed comment (see
[6.8]).


4.1 (2)-Serial CSP-1. The direct product of two matrices is given by

Definition 10. Let A and B be n x m and r x s matrices, respectively. Then
the direct product of A and B is the nr x ms matrix

\[ A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1m}B \\ \vdots & \vdots & & \vdots \\ a_{n1}B & a_{n2}B & \cdots & a_{nm}B \end{pmatrix} \]

(with some abuse of notation in using B rather than its entries). A direct
product is sometimes referred to as a Kronecker product in the case of
matrices and an (algebraic) tensor product when the factors are explicitly
linear operators.
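Definition 10 translates directly into a few lines of code. The sketch below uses plain nested lists and hypothetical function names; it also checks the mixed-product property $(A \otimes B)(C \otimes D) = AC \otimes BD$, one of the standard properties of direct products relied on later without detailed comment.

```python
def kron(A, B):
    """Direct (Kronecker) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[0, 1], [1, 0]]
    C = [[1, 0], [1, 1]]
    D = [[2, 0], [0, 3]]
    for row in kron(A, B):
        print(row)
    # Mixed-product property: (A (x) B)(C (x) D) equals (AC) (x) (BD).
    print(matmul(kron(A, B), kron(C, D)) == kron(matmul(A, C), matmul(B, D)))
```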

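Given the (2)-MC model in Figure 3, Chapter 2, Definition 10 can be
used to express its transition matrix in a compact form which is given by
the third equation below. By construction, the (2)-MC matrix can be written
as

\[ \Lambda_{12} = \begin{pmatrix} p_1 I_2 & q_1 A_2 & 0 & \cdots & 0 \\ p_1 I_2 & 0 & q_1 A_2 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ p_1 I_2 & 0 & 0 & \cdots & q_1 A_2 \\ \delta_1 I_2 & 0 & 0 & \cdots & \bar{\delta}_1 A_2 \end{pmatrix} \]

Using some simple properties of direct products, we can rewrite the above
matrix as

\[ \Lambda_{12} = \begin{pmatrix} p_1 I_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ p_1 I_2 & 0 & \cdots & 0 \\ \delta_1 I_2 & 0 & \cdots & 0 \end{pmatrix} - \begin{pmatrix} p_1 A_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ p_1 A_2 & 0 & \cdots & 0 \\ \delta_1 A_2 & 0 & \cdots & 0 \end{pmatrix} + \begin{pmatrix} p_1 A_2 & q_1 A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ p_1 A_2 & 0 & \cdots & q_1 A_2 \\ \delta_1 A_2 & 0 & \cdots & \bar{\delta}_1 A_2 \end{pmatrix} \]

\[ \Lambda_{12} = C_1 \otimes (I_2 - A_2) + A_1 \otimes A_2 \]

where

\[ C_1 = p_1 \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ f_1 & 0 & \cdots & 0 \end{pmatrix} \quad (\text{an } i_1 \times i_1 \text{ matrix}) \]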
Concerning $\Lambda_{12}$, we have

Theorem 11. $\Lambda_{12}$ is aperiodic, irreducible, and finite. Moreover, if
$e_j$ is the long run probability vector (pv) of $A_j$, j = 1, 2, then $e_1 \otimes e_2$ is
the long run probability vector (pv) of $\Lambda_{12}$.

Proof. 1) Refer to Figure 3, Chapter 2. The state (0,0) is aperiodic.
It is straightforward but tedious to verify that (0,0) can be reached from
any state and that from (0,0) any state can be reached. Thus the matrix
is irreducible. Being irreducible and having one aperiodic state (0,0)
imply aperiodicity for the matrix. Finally, it is trivial that the matrix
is finite since its direct product components are. 2) To prove the second
part of Theorem 11, we use the fact that a finite, aperiodic, and irreducible
MC matrix has a unique eigenvector with eigenvalue 1. By assumption,

\[ e_j A_j = e_j, \quad \|e_j\|_1 = 1, \quad e_j \text{ unique}, \]

where $\|v\|_1 = \sum_k |v_k|$ and j = 1, 2. Thus we have,

\[ (e_1 \otimes e_2) \Lambda_{12} = (e_1 \otimes e_2)\big( C_1 \otimes (I_2 - A_2) \big) + (e_1 \otimes e_2)(A_1 \otimes A_2) \]
\[ = e_1 C_1 \otimes e_2 (I_2 - A_2) + e_1 A_1 \otimes e_2 A_2 \]
\[ = e_1 C_1 \otimes 0 + e_1 \otimes e_2 \]
\[ = e_1 \otimes e_2 \]

Moreover the entries of $e_1 \otimes e_2$ are positive and add to one by the
definition of the direct (or tensor) product of two vectors (of course,
"$e_1 \otimes e_2$" requires, by Definition 10, that $e_1$ be 1 x $i_1$ and $e_2$ be 1 x $i_2$).
The uniqueness of a long run pv finishes the proof.
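The second part of Theorem 11 lends itself to a direct numerical check. The sketch below assembles the (2)-MC matrix as $C_1 \otimes (I_2 - A_2) + A_1 \otimes A_2$, with $C_1$ taken to be the first column of $A_1$ padded with zeros, and verifies that the direct product of the two long run pv's is left unchanged by it. The reduced CSP-1 chain (i screening states plus one collapsed uls state), the function names, and the parameter values are assumptions of the sketch.

```python
def csp1_matrix(p, i, f):
    """Reduced CSP-1 chain: states 0..i-1 are screening run lengths, state i is uls."""
    q, delta = 1.0 - p, f * p
    n = i + 1
    A = [[0.0] * n for _ in range(n)]
    for k in range(i):
        A[k][0] += p
        A[k][k + 1] += q
    A[i][0] += delta
    A[i][i] += 1.0 - delta
    return A


def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vecmat(v, A):
    return [sum(v[r] * A[r][c] for r in range(len(A))) for c in range(len(A[0]))]


def stationary(A, iters=5000):
    v = [1.0 / len(A)] * len(A)
    for _ in range(iters):
        v = vecmat(v, A)
    return v


def serial_matrix(A1, A2):
    """C1 (x) (I - A2) + A1 (x) A2, where C1 keeps only the first column of A1."""
    n1, n2 = len(A1), len(A2)
    C1 = [[A1[r][0] if c == 0 else 0.0 for c in range(n1)] for r in range(n1)]
    I_minus_A2 = [[(1.0 if r == c else 0.0) - A2[r][c] for c in range(n2)]
                  for r in range(n2)]
    X, Y = kron(C1, I_minus_A2), kron(A1, A2)
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


if __name__ == "__main__":
    A1 = csp1_matrix(0.05, 3, 0.2)
    A2 = csp1_matrix(0.10, 2, 0.25)
    L12 = serial_matrix(A1, A2)
    e = [a * b for a in stationary(A1) for b in stationary(A2)]   # e1 (x) e2
    moved = vecmat(e, L12)
    print(max(abs(x - y) for x, y in zip(e, moved)))   # should be close to zero
```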


We now turn to the investigation of the marginal AFI(∞) for the second
plan.

Theorem 12. For (2)-serial CSP-1, the marginal AFI(∞) is given by

\[ AFI_2(\infty) = 1 - v_2\, \alpha_{b_1} e_{2,i_2} \]

($\alpha_{b_1}$ is used as shorthand since no SMC reduction is used in the proof.)
Proof, 

|c«ik)(e2j)J 


Al®  la 

by  definition. 


l-AP'IjC*)  ■ Llm 
N 


[a.e,],  by  definition. 


U-i  J-i  J 


-V,  (j.,) 


•21, 


[a.e.] 


- V2«bj  eaijj 

Except for three comments, the theorem is finished.

if  Mtnpllng  begins  with  state  (0,0)  with  probability  one  or  with 
fia  > <^ben  operating  on  the  characteristic  functional, 

®t*|s]i  where 

S ■ Mi2(0)  ■ (0,0)"  or  "e,  (25  Sj",  allows  the  dropping  of  "[s.s.ll'  Secondly 
1 " 1»2.  Finally,  the  doflnltlon  of  the  functional  implies 

that  ws  are  considering  tho  identity  matrix  as  a legitimate,  but  virtual 

Definition  5 and’ 

We see from Theorem 12 that the formula

\[ 1 - v_2\, \alpha_{b_1} e_{2,i_2} \]

is the average number of units which are actually inspected for the second
characteristic. Thus, for the second plan in tandem, "not inspected" is
not equivalent to "sampled" because of the control exerted by the first
plan on the second (recall the two interpretations of the identity matrix
in Definition 5). In other words, $1 - AFI_2(\infty)$ is the average fraction not
inspected whereas $1 - AFI(\infty)$ is the average sampled (equal to
$v_2 e_{2,i_2}$).

Before leaving this section, an alternate "proof" of Theorem 11 will
be given which will, in addition, give some insight into the transient
behavior of the (2)-serial CSP-1 model. If a (2)-MC pv can be expressed
as the direct product of two pv's (one for each plan), then such a pv will
be called separable. Given that the (2)-MC starting pv (initial pv) is
separable, define the pv's $x^0$ and $y^0$ as the initial pv's for the first and
second plans, respectively.* Defining a vector as a unity vector iff each


to  tha  column  (row)  vactor  v).  Than? 


(«>  ® i‘)si2i  ■ («»  ® y')B;i®  h * (Ai-  Cl)  ® Aj]  iia 


uj  - go 

..1 


where $g_0 = 1 - v_1 x^0_{i_1}$ and, in the same way,

ea)^(x^<lQ  ;i£.^)  - gol®  + (l-8o>i^ 


*Other~waye  of  writing  ^ erat  x ' 2L  X column  vectors, 
end  just 

**  sa»  m 

(XI  \ xivi — XiXij 

I )(yp yij)  • 

Hxl  [xiiyi— *iiyi2_ 

In all four notations, the result is an outer product which is a matrix.


However, $(x^0 \otimes y^0) A_{12} = x^1 \otimes y^1$


which, combined with the above, gives

$x^1 = x^0 A_1$ and $y^1 = y^0 (g_0 I_2 + (1 - g_0) A_2)$.

Repeating with $x^1 \otimes y^1$,

$x^2 = x^1 A_1$ and $y^2 = y^1 (g_1 I_2 + (1 - g_1) A_2)$

where $g_1 = 1 - v_1 x^1_{i_1}$. In general, by induction, we have

$x^{r+1} = x^r A_1$ and $y^{r+1} = y^r (g_r I_2 + (1 - g_r) A_2)$

where $g_r = 1 - v_1 x^r_{i_1}$. Thus, for a separable pv, the first plan's pv
propagates according to the plan's own individual structure. In contrast,
the second plan's pv propagates as a nonlinear function (because of
"$1 - g_r$") of vectors (pv's) arising from both plans and dependent on $v_1$.
The relevance to Theorem 11 arises from the observation that for all
practical purposes, any pv for the model can be considered separable even
though theoretically, there are nonseparable pv's whose transfinite
cardinal number is equal to that of the set of all linear functions from
the unit interval to itself (loosely speaking, there are an infinite number
of ways to factor a real number). In particular, $e_1 \otimes e_2$ is separable
(by construction) but self replicating.
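The separable recursion above can be tried numerically. The following is a minimal sketch, not the paper's own computation: the helper `csp1_matrix` builds a hypothetical CSP-1 chain (states 0..i, with the last state taken to be the sampling state), and the parameter values are made up for illustration. Only the update rule itself, $x^{r+1} = x^r A_1$ and $y^{r+1} = y^r (g_r I_2 + (1 - g_r) A_2)$ with $g_r = 1 - v_1 x^r_{i_1}$, is taken from the text.

```python
import numpy as np

def csp1_matrix(p, f, i):
    """Hypothetical CSP-1 chain on states 0..i (an assumption, not the paper's matrix).

    States 0..i-1 count consecutive good inspected units; state i is sampling.
    """
    q = 1.0 - p
    A = np.zeros((i + 1, i + 1))
    for j in range(i):
        A[j, 0] += p          # inspected unit defective: restart the count
        A[j, j + 1] += q      # inspected unit good: advance toward sampling
    A[i, 0] = f * p           # sampled unit defective: back to screening
    A[i, i] = 1.0 - f * p     # sampled unit good, or unit skipped
    return A

def propagate(x0, y0, A1, A2, v1, steps):
    """Iterate the separable recursion stated in the text."""
    x = np.asarray(x0, dtype=float)
    y = np.asarray(y0, dtype=float)
    I2 = np.eye(A2.shape[0])
    for _ in range(steps):
        g = 1.0 - v1 * x[-1]                 # g_r = 1 - v_1 x^r_{i_1}
        y = y @ (g * I2 + (1.0 - g) * A2)    # second plan depends on x^r
        x = x @ A1                           # first plan follows its own chain
    return x, y

# Illustrative (made-up) parameters; v1 = 1 - f for the first plan.
A1 = csp1_matrix(p=0.10, f=0.2, i=3)
A2 = csp1_matrix(p=0.20, f=0.5, i=2)
x, y = propagate([1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0], A1, A2, v1=0.8, steps=5000)
print(x, y)
```

Under these assumptions both vectors stay probability vectors, and after many steps each is (numerically) a fixed point of its own plan's matrix, which is the behavior the alternate "proof" relies on.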

The connection with Theorem 11 will be completed by showing convergence of
$x^r$ and $y^r$ to $e_1$ and $e_2$, respectively. In the process, we will see that the
model can be decomposed into a stationary ergodic and a nonstationary strongly
ergodic MC thereby providing a link to the results of Chapter 3. From Chapter 3
and [6.10], the matrix

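\[ \prod_{j=1}^{r} A'_2(j) = \prod_{j=1}^{r} \bigl(g_{j-1} I_2 + (1 - g_{j-1}) A_2\bigr) = A'_2(1,r) \]

strongly converges to $[e_2]$, since

\[ \lim_{j} A'_2(j) = (1 - v_1 e_{1 i_1}) I_2 + (v_1 e_{1 i_1}) A_2 \]

\[ = \mathrm{AFI}(1) I_2 + (1 - \mathrm{AFI}(1)) A_2 \]

\[ = A'_2, \qquad \mathrm{AFI}(1) = \mathrm{AFI}(\infty, p_1, i_1, f_1), \]

and $(A'_2)^r$ (strongly) converges to $[e_2]$ by the usual summability arguments.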

1 


Thus $y^r$ strongly converges to $e_2$. The decomposition of $A_{12}$ into $A_1$ and
$A'_2(1,r)$ is not surprising since the first plan doesn't depend on the
second while the nonstationary MC appears because we are restricting
attention to the second plan which does depend on the first.


4.2 (n)-Serial CSP-1. These plans can also be easily handled by direct
products. Before proving the next theorem, some new matrices must first
be provided. By extension of direct products to three or more matrices,
we define the needed matrices in


Definition 11. Given (n)-serial CSP-1, the (n)-serial transition
matrices are


Ain  • 


where 


Akn  “ 


Pilan 

^lA  2tt 

0 

1 
1 

o 

Pi^ln 

0 

‘^lAzn'"  0 

PlUn 

0 

0 ^liAjn 

0 

0 djA  2n 

Pkl(k+l)n 

*lkA(k+l)n  0 — 0 

- •• 

Pkl(k4-l)n 

0 

‘lkA(k+l)n 

0 

'^k^Ck+Dn 

0 

0““‘‘kA(u+i)n 

M(k+l)n 

0 

0 ®kA(k+l)n 

for $2 \le k \le n-1$ (and for k = 1). More explicitly,


$A_{kn}$ = the transition matrix of the (n-k+1)-serial CSP-1, consisting
of CSP-1's k through n from the original (n)-serial CSP-1.

For k = n, $A_{nn} = A_n$. Moreover,

$I_{(k+1)n}$ = identity matrix of rank $(i_{k+1} \cdots i_n)$ for $1 \le k \le n-2$,

and of rank $i_n$ for k = n-1;

that is,

$I_{(k+1)n} = I_{k+1} \otimes \cdots \otimes I_n$, $1 \le k \le n-2$


Some important relationships exist for these matrices in


Theorem 13. Given the matrices in Definition 11, we have

$A_{1n} = C_1 \otimes (I_{2n} - A_{2n}) + A_1 \otimes A_{2n}$.

More generally,

$A_{kn} = C_k \otimes (I_{(k+1)n} - A_{(k+1)n}) + A_k \otimes A_{(k+1)n}$, $1 \le k \le n-1$.

Proof. By definition,

$A_{(n-1)n} = C_{n-1} \otimes (I_{nn} - A_{nn}) + A_{n-1} \otimes A_{nn}$

since $A_n = A_{nn}$ and $I_n = I_{nn}$. Backward induction on k, the second equation
in the statement of this theorem, and the decomposition of the transition
matrices according to Definition 11 give the result for fixed n. Backward
induction can be converted into forward induction by relabelling. Double
induction can also be done by varying n, keeping k fixed, and then proceeding
by induction on k, keeping n fixed.
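The backward recursion of Theorem 13 is easy to mechanize with Kronecker (direct) products. The sketch below is illustrative only: `serial_matrix` and `random_plan` are hypothetical helper names, and the matrices are random stand-ins (any row-stochastic $A_k$ with $0 \le C_k \le A_k$ entrywise), not actual CSP-1 matrices. It builds $A_{13}$ by the recursion and compares it with the three-term expansion obtained by substituting the k = 2 equation into the k = 1 equation.

```python
import numpy as np

def serial_matrix(A, C):
    """A_{1n} by backward recursion; A = [A_1..A_n], C = [C_1..C_{n-1}]."""
    M = A[-1]                                   # A_nn = A_n
    for Ak, Ck in zip(reversed(A[:-1]), reversed(C)):
        I = np.eye(M.shape[0])                  # I_{(k+1)n}
        M = np.kron(Ck, I - M) + np.kron(Ak, M) # Theorem 13, general equation
    return M

def random_plan(rng, size):
    """Random stand-in for one plan: row-stochastic A_k and some C_k <= A_k."""
    A = rng.random((size, size))
    A /= A.sum(axis=1, keepdims=True)
    C = A * rng.random((size, size))
    return A, C

rng = np.random.default_rng(0)
(A1, C1), (A2, C2), (A3, _) = (random_plan(rng, s) for s in (2, 3, 2))
A13 = serial_matrix([A1, A2, A3], [C1, C2])

# Three-term expansion for n = 3.
I2, I3 = np.eye(3), np.eye(2)
expanded = (np.kron(C1, np.kron(I2, I3))
            + np.kron(A1 - C1, np.kron(C2, I3))
            + np.kron(A1 - C1, np.kron(A2 - C2, A3)))
print(np.allclose(A13, expanded), np.allclose(A13.sum(axis=1), 1.0))
```

The row sums equal one because the term $C_k \otimes (I - A)$ has zero row sums whenever A is row-stochastic, so the serial matrix is again a MC matrix.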

To determine the long run pv of (n)-serial CSP-1, Theorem 13 will be
used, in Theorem 14, along with

Proposition 3. If $B_1$ and $B_2$ are transition matrices for two finite,
aperiodic, and irreducible MC's, then $B_1 \otimes B_2$ has these three properties.

Moreover, if $e_1$ and $e_2$ are the long run pv's for $B_1$ and $B_2$, respectively,
then $e_1 \otimes e_2$ is the long run pv for the matrix direct product.

Proof. It is trivial that the direct product is finite. The other
properties follow from the equation
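Proposition 3 can be checked numerically on small examples. The sketch below uses two made-up transition matrices (both finite, irreducible, and aperiodic) and a hypothetical helper `stationary` that solves $\pi P = \pi$, $\sum \pi = 1$ by least squares; it then confirms that the direct product of the two long run pv's is the long run pv of the direct product of the matrices.

```python
import numpy as np

def stationary(P):
    """Long run pv of a row-stochastic matrix P: solve pi P = pi, sum(pi) = 1."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones(n)])   # stationarity + normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

# Illustrative matrices (assumptions, not taken from the paper).
B1 = np.array([[0.5, 0.5],
               [0.2, 0.8]])
B2 = np.array([[0.1, 0.9, 0.0],
               [0.3, 0.3, 0.4],
               [0.5, 0.0, 0.5]])

e1, e2 = stationary(B1), stationary(B2)
e12 = np.kron(e1, e2)            # candidate long run pv for B1 (x) B2
B12 = np.kron(B1, B2)
print(np.allclose(e12 @ B12, e12), np.allclose(e12, stationary(B12)))
```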

Theorem 14. Given (n)-serial CSP-1 together with the long run pv's, $e_k$,
for the constituent plans ($1 \le k \le n$), the long run pv for the serial
plan (model) is

$e_1 \otimes e_2 \otimes \cdots \otimes e_n$

Proof. Using backward induction on the index k, for fixed n (and more
generally, double induction on k and n) as in Theorem 13, Theorem 12 shows
that $e_{n-1} \otimes e_n$ is the long run pv for $A_{(n-1)n}$. The first equation in
Theorem 13, the equation

$e_1 \otimes e_2 \otimes \cdots \otimes e_n = e_1 \otimes (e_2 \otimes \cdots \otimes e_n)$,

the fact that $A_1$ is a MC matrix, and Proposition 3 applied to $A_1 \otimes A_{2n}$
suffice to finish the proof.

As an example of Theorem 14, consider (3)-serial CSP-1. The "Go"
probabilities for a transition in the third plan are:

an'l  Si$2  . 

The "Stop" probabilities for a (virtual) transition in the third plan are:

Pj.  qip2*  *ll^2’  ®l®2* 

The matrices are:

$A_{13} = C_1 \otimes (I_{23} - A_{23}) + A_1 \otimes A_{23}$    (A4)

$A_{23} = C_2 \otimes (I_3 - A_3) + A_2 \otimes A_3$    (B4)

Rearranging Eqs. (A4) and (B4), expanding the "23" identity matrix, and
substituting the rearranged Eq. (B4) into the altered Eq. (A4) yield

$A_{13} = C_1 \otimes I_2 \otimes I_3 + (A_1 - C_1) \otimes C_2 \otimes I_3
+ (A_1 - C_1) \otimes (A_2 - C_2) \otimes A_3$.

Looking at this last equation and the "Go" and "Stop" probabilities, the
first term of the equation is the "Stop" matrix for transitions in the
second and third plans together while the second term is the "Stop" matrix
for transitions in the third plan alone. The third term is, of course, the
"Go" matrix for all three plans together.

We investigate, at this point, an alternate "proof" for Theorem 14
analogous to the one given for Theorem 11. First of all, we derive the
recursions and the decomposition which result from the assumption of an
initial separable pv for (3)-serial CSP-1. The extension of the results
to (n)-serial CSP-1 is then easily obtained.

Let $x^0 \otimes y^0 \otimes z^0$ be an initial pv for the (3)-serial CSP-1 model.
Furthermore, define the following three sequences of vectors:

- x°  A[i  A^'  ^ ° A3  1 

Also define a(2,3) to be the unity 1 x ($i_2 i_3$) vector, and a(k) the unity
1 x $i_k$ vector, for $1 \le k \le 3$. Rewrite the equations for $A_{13}$ and $A_{23}$:

$A_{13} = C_1 \otimes I_{23} + (A_1 - C_1) \otimes A_{23}$    (C4)

and

$A_{23} = C_2 \otimes I_3 + (A_2 - C_2) \otimes A_3$

From Eq. (C4), we have

a'®  (x>®  •')  - a'  Cl®  (a'®  i') 


(D4) 


+ (u^  - ^ ® A23> 


f 


81nc«  tha  oofliponenta  of  add  to  ona,  we  hava  from  tha  RHS  of  the 

laat  aquation 

{xl©  (JL^©  1^))  a(2.3)  ■ 


(T)  •■'■(T) 


■ goU)  ■ 

Thua«  aa  before  we  have  * u^.  Since  the  components  of  x^  add  to  one*  wa 
also  have  ^ 

a(l)^  (xl©  ab)  - god)  (Z°®  2°)  + a-goCl))  (2®®  *°  A23) 

“ ©a^ 

Using Eq. (D4) to evaluate the second factor of the second term of the RHS
of the second to last equation gives the following string of manipulations.

(^0)^©  (8°)‘  - (l°©  8°)  A23 

® ■ (2:°©  «®>  A23  8d)  or 

<X®)^  ■ v'  <«e  with  tha  <2)-serial  case) 

a£3i.‘((jL°^©(8®)^  - Sill®  (<2®®  S°>  A23) 

(a®)^  - go (2)  8°  + (l-go(2))wl 

“ s°  ^go(2)  I3  + (1-go (2))  As) 
where  go<2>  ■ i-vaVij,.  Finally, 

(ill©  si)  ai^  - y®(go(l>  I2  + (1-80(1))  A2) 
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and 


a(3)^(y^  z^)  - z^ 

“ (sO<i»2)  I3  +.C1-‘8o(1»2>)  A3) 

vhefe  l-go(l|2)  ■ ?voceading  by  Induetion, 

^r+1  « ^r+l^  j,r+l  « ^r+l  . *r 

whete 

A**2  I2  + (l-grO>))  A 2 • “ l“Vix£^ 

• BtCl*2)  I3  + a-gp<l,2))  A3.  l“Br(1.2)  " (l“8r  <l“8r  (2» 

and  gr^2)  ■ l~V2y|j,‘  Furthermore, 

- jr 0 ^"2  (j)  and  «5  - *°  ^ A'3  C3) 

Therefore once again $x^r$ (strongly) converges to $e_1$, $g_r(1)$ converges to
$1 - v_1 e_{1 i_1}$, and $g_r(2)$ converges to $1 - v_2 e_{2 i_2}$. It follows then that

$1 - g_r(1,2)$ converges to $(1 - v_1 e_{1 i_1})(1 - v_2 e_{2 i_2})$

and that

$A'_3(r)$ (strongly) converges to $(1 - \mathrm{AFI}(1) \cdot \mathrm{AFI}(2)) I_3 + \mathrm{AFI}(1) \cdot \mathrm{AFI}(2) A_3$.

From the last statement, the recursive relationships, summability theory,
and [6.10], $z^r$ converges to $e_3$.

In summary, for an initial separable pv for (3)-serial CSP-1, all
three components individually converge to their long run pv's which are
independent of one another. Moreover, analogous to $A_{12}$, $A_{13}$ decomposes
into one ergodic stationary MC and two strongly ergodic nonstationary MC's,
the third depending on the first two.

The vector approach can be generalized to (n)-serial CSP-1 as an
alternate "proof" for Theorem 14. However, the major reason for the vector
approach is to obtain recursions and the manner of convergence. By induction,
one can now easily show the following recursions and decomposition for $A_{1n}$
with an initial separable pv $x_1^0 \otimes \cdots \otimes x_n^0$. An outline of the results
is given below.

$x_s^{r+1} = x_s^r A'_s(r)$, $1 \le s \le n$

where

$A'_s(r) = g_r(1,2,\ldots,s-1) I_s + (1 - g_r(1,2,\ldots,s-1)) A_s$

and


(l-Vk(x^)l^) 


8-1 

3-1)  ■ Xt 
k"l 

(and $g_r(0) = 1$). Then taking limits, we have a "proof" for Theorem 14.
In general then, $A_{1n}$ decomposes into an ergodic stationary MC and (n-1)
strongly ergodic nonstationary MC's of increasing dependence on the
elements of all the preceding MC's.


We now deal with $\mathrm{AFI}_n(\infty)$ in

Theorem 15. For (n)-serial CSP-1,

$1 - \mathrm{AFI}_n(\infty) = v_n (a_{b_1} a_{b_2} \cdots a_{b_{n-1}}) e_{n i_n}$

(again $a_{b_j}$ is shorthand, $1 \le j \le n-1$).

Proof. From Theorem 14,

110 — 0 In  “ r<eij)(«2k)‘ — *<«na) 


The  rest  of  the  proof  follows  the  logic  of  Theorem  12.  For  example,  the 
functional  is 


C(j,i„)<“l(n-1).  %llc) 


where J is the set of (n-1)-tuples of indices varying in a manner such that
the rth index varies between 1 and $i_r$, $1 \le r \le n-1$.

The same comments made about $\mathrm{AFI}_2(\infty)$ and $\mathrm{AFI}(\infty, p_2, i_2, f_2)$ also apply to
$\mathrm{AFI}_n(\infty)$ and $\mathrm{AFI}(\infty, p_n, i_n, f_n)$.


4.3 (n)-Serial CSP. An example of a CSP, different from CSP-1, is CSP-2
given in Figure 4. The limited sampling phase (abbr. ls) requires sampling
at some frequency and, in addition, has a "clearance" number (for successive,
but not consecutive, k nondefective inspected items). In a sense, the ls is
a combination of the sc and uls phases.


Figure 4

Block Diagram of CSP-2

[Block diagram not reproduced: phases sc, uls, and ls joined by arrows
(1) through (4).]

sc and uls - as in Figure 1

ls - limited sampling phase

Arrows (1) and (2): As in Figure 1

Arrow (3): If k units are successively inspected
and found to be defect free

Arrow (4): If the jth unit inspected is found to be
defective, $1 \le j \le k$

The ls phase can be looked upon as consisting of k MC states. Further,
each state, SLj, has transitions to HO and SL(j+1) (or to Si for j = k),
given in the s transform mode, as follows (see [6.12] for further details):

SLj to SL(j+1) or Si given by "$\lambda/(s-v)$", $\lambda = fq$

SLj to HO given by "$d/(s-v)$", $d = fp$

As an example, consider the (2)-serial CSP given by a CSP-2 followed by a
CSP-1 (the reverse order is easy since then the component matrix $A_2$ is
just the transition matrix for CSP-2). The matrix for the total plan is

$A_{12}(2,1) = C_1(2) \otimes (I_2 - A_2(1)) + A_1(2) \otimes A_2(1)$


Dropping indices on the individual probabilities, those matrices used on
the RHS above which come from use of CSP-2 are


Al<2)  ■ 


Gli 

Gki 

Gik 

Gkk 

where 


p q 0 ' — 0 
P 0 q;  -»  0 

p Q 0 ^ 

000  0 


. • Gkk' 


0 V X 


0 0 0 -'*■* 


Gik  “ 


4 0 0 0 

600  0 


6 0 0 


and $G_{rs}$ is an r x s matrix. Also


4 0 0 — 


Cl(2) 


0 0 f “““  0 


where the "f" not in column 1 is in (col, row) (i+1, 1). Formally, the
analysis can proceed in a manner entirely analogous to that done in Section
4.2. For an initial separable pv, the decomposition of this (2)-serial
CSP into a stationary MC and a nonstationary MC also holds. More generally,
such decompositions, analogous to the one which holds for (n)-serial CSP-1,
hold for any (n)-serial CSP.

4.4 Variant Multicharacteristic Plans. The first plan that would seem a
natural variant is one whose MC matrix is given by

$A_{12} = A_1 \otimes A_2$

With this plan, the state determinations for each component are independent
of one another. By Proposition 3, the above matrix is irreducible, finite,
and aperiodic with long run pv $e_1 \otimes e_2$.

Another possible variant is given in Figure 5.

Figure 5

Variant Multicharacteristic Plan


States: Same as in Figure 3
Transitions ((kj)+1 may be ij for j = 1,2):


State           State                 Probability

(k1,k2)         ((k1)+1,(k2)+1)       [illegible]

[illegible]     (i1,(k2)+1)

(k1,i2)         ((k1)+1,i2)

(i1,i2)         (i1,i2)

Any of above    (0,0)                 1-Prob(state)

The transition rules in Figure 5 can be restated: transitions take place
iff both characteristics are each either inspected and found nondefective or
sampled. If we let $i_1 = i_2 = i$, this plan has one ergodic class given by the
diagonal ordered pairs $\{(j,j) : 0 \le j \le i\}$; all other states are transient.

Moreover, if the inspection starts off with the state (0,0), we then have a
plan equivalent to CSP-1 with $p = 1 - q_1 q_2$ and $f = f_1 f_2$. However, with this
plan, marginal AFI has no meaning because of the ambiguity expressed by p
and 1-f. It is even doubtful whether the traditional AFI function would
be a good measure of effectiveness for such a plan.


5.0 CONCLUSION. The motivation for this paper is Chapter 3 even though the
main, workable results are contained in Chapter 4.

5.1 Chapter Three. The two models considered in Chapter 3 employ SMC
reduction in an attempt to simplify the second order MC model at the end of
Chapter 2 and highlight the difference between it and the (approximate)
model given by the nonstationary MC. Any simplification of the (2)-MC
model by using SMC reductions for both plans would probably not be worth
the effort since superimposing two independent SMC's is quite a complex
process in itself; here, of course, the SMC's are dependent.*

If we are only interested in the long run case (ignoring the transient
case which is hard to analyze anyway), SMC reduction of both plans can be
used to yield a model consisting of the states {(a,1), (a,2), (b,1), (b,2)}
where the letters and numerals refer to the second and first SMC reduction,
respectively, in Chapter 2. This model would replace the pdf's of states b
and 1 by geometric pdf's. The conditions to be satisfied for this change are
*•  Uu  and  -7-  - p, 

“I 


The  (q')'s  are  to  be  determined  given  the  standard  mean  times  and 


*More results on products of random matrices may be found in [6.4]
where various types of independence assumptions are invoked.


5.2 Chapter Four. One main result is Theorem 12 (and Theorem 14). As
a consequence of the theorem, the expression "$v_2 e_{2 i_2}$" has two
interpretations: the average fraction sampled in the usual sense and the
average fraction not inspected in the serial sense. The other main result,
not formally stated in any theorem, is the decomposition of any (n)-serial
CSP into a sequence of MC's, the first stationary, the remaining
nonstationary.

The (2)-MC model assumes that the characteristics are independent.

This condition can be relaxed if the ordered pairs remain independent but
the two elements of any particular pair are allowed to be correlated. Let
$Z_j = (X_j, Y_j)$ be the description of the jth unit. That is, $X_j$ ($Y_j$) = 0 or 1
if the first (second) characteristic is nondefective or defective,
respectively. The relaxation is equivalent to the assumption that the $Z_j$
form a Bernoulli process but that $X_j$ and $Y_j$ are not independent. Then,
using the definitions of correlation coefficient and conditional probability
($\sigma_w^2 = p_w q_w$; w = 1,2) we have


$P[Y_j = 1 \mid X_j = 0] = p_2' = (q_1 p_2 - r \sigma_1 \sigma_2)/q_1$

and

$P[Y_j = 0 \mid X_j = 0] = q_2' = 1 - p_2'$


Now $p_2' < p_2$ (or $> p_2$) iff $r \sigma_1 \sigma_2 > 0$ (or $< 0$) and then iff the
characteristics are positively (or negatively) correlated. In particular,
if r > 0 (or < 0), then $\mathrm{AFI}_2(\infty)$ will be smaller (or larger) than that
obtained in Chapter 4. We finally note that for the random variables $X_j$
and $Y_j$, being uncorrelated is equivalent to independence.
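The conditional probability above follows directly from the 2 x 2 joint law of $(X_j, Y_j)$. The sketch below is a check under made-up parameter values; the function names are illustrative. It builds the joint cell probabilities from $p_1$, $p_2$, and r, computes $P[Y_j = 1 \mid X_j = 0]$ from the cells, and compares it with the closed form $p_2 - r \sigma_1 \sigma_2 / q_1$.

```python
import math

def conditional_defect_prob(p1, p2, r):
    """P[Y=1 | X=0] computed from the 2x2 joint law with correlation r."""
    s1 = math.sqrt(p1 * (1.0 - p1))
    s2 = math.sqrt(p2 * (1.0 - p2))
    p11 = p1 * p2 + r * s1 * s2       # P[X=1, Y=1] = p1 p2 + covariance
    p01 = p2 - p11                    # P[X=0, Y=1]
    p10 = p1 - p11                    # P[X=1, Y=0]
    p00 = 1.0 - p11 - p01 - p10       # P[X=0, Y=0]
    if min(p00, p01, p10, p11) < 0.0:
        raise ValueError("r is not attainable for these marginals")
    return p01 / (1.0 - p1)

def closed_form(p1, p2, r):
    """p2' = p2 - r sigma1 sigma2 / q1."""
    s1 = math.sqrt(p1 * (1.0 - p1))
    s2 = math.sqrt(p2 * (1.0 - p2))
    return p2 - r * s1 * s2 / (1.0 - p1)

# Illustrative values: positive correlation lowers p2' below p2.
p1, p2, r = 0.1, 0.2, 0.3
print(conditional_defect_prob(p1, p2, r), closed_form(p1, p2, r))
```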


In the variant case, a (2)-characteristic plan is given where the very
meaning of marginal AFI is nonexistent. Such a plan might be useful for
cases of large positive correlation.


ABSTRACT. The James-Stein estimator improves the expected mean
square error of $k \ge 3$ independent sample means for all possible
combinations of true means. In spite of this, it is not widely used in
practical applications, partly because no confidence intervals accompany
it. We derive interval estimates in this paper based on an uninformative
prior distribution and illustrate the use and success of
the method in an application. Not only is the estimator about three
times as efficient as the sample mean vector in this example, but the
intervals provided are 37 percent shorter while containing the true
values with greater frequency than nominally claimed. The prior is
used in the final section to extend the James-Stein estimator and to
provide interval estimates for the case when the unknown parameters
are exchangeable but the sample means have unequal variances.

1. INTRODUCTION. The James-Stein estimator (1961) of the means
of $k \ge 3$ independent normal distributions is well-known for being
uniformly and substantially better than the sample mean, on the basis of
its expected sum of squared errors. The James-Stein estimator and its
generalizations apply to many situations involving linear models, and
offer mean squared error improvements over the classical estimators in
many of the applications of statistics. Nevertheless, an informal poll
of perhaps 150 statisticians at this conference revealed that only one
(I would be a second) had ever used a Stein-like estimator in a real
application.

Why? Polls of other groups of statisticians probably would yield
similar results, although subjective Bayesians and ridge analysts may
use related methods more frequently in actual data analysis. The
reasons certainly include unfamiliarity on the part of many statisticians
with the methods and the types of applications for which the James-
Stein estimator in particular, and multiparameter estimation in general,
is best suited. Long acceptance of the sample mean and its simplicity
makes statisticians reluctant to reject it in favor of a more
complicated and imperfectly understood method. Furthermore, the use of the
James-Stein estimator requires making judgments about which problems
to combine, which not to, and the choice of origin to shrink toward.

If these judgments are not good, then the James-Stein estimator will
improve on the total mean squared error of the sample mean
insignificantly, and can be much worse for some coordinates. These reasons for
the nonuse of the James-Stein estimator in applications are discussed
more fully in Efron-Morris (1975, Secs. 1, 5).


*This work was partially supported by a grant from the U.S. Department
of Health, Education, and Welfare, Washington, D.C.

Even those familiar with the James-Stein estimator often do not
use it in its simplest form because the assumptions made for its
derivation usually are not met. Instead, a generalization usually must
be derived to estimate an appropriate origin, to account for nonnormal
distributions, for unequal variances of the observations, for unknown
variances of the observations, for regression situations, for
multivariate data, or for another variation of the assumptions. Recent
progress in providing these generalizations has not yet had much impact.
Furthermore, the generalizations derived by different researchers are
not always in agreement because they are not derived from any single
principle. It seems to me, however, that data analysts probably will
find the empirical Bayes viewpoint most useful both for identifying
appropriate situations for using the James-Stein rule and its
generalizations, and for deriving appropriate generalizations. For that
reason the empirical Bayes viewpoint has been used in most of my papers
with Professor Efron (March 1972, August 1972, March 1973, November
1973, 1975, March 1977, May 1977) on this topic.

Another deterrent to using the James-Stein estimator is that
despite its ability to reduce mean squared error, no methods have been
developed for estimating the precision of the estimates, or for
determining confidence intervals. (Some attempts have been made by Stein
(1962, 1975, 1974), but the results there are largely theoretical and
asymptotic.)

The primary purpose of this paper is to provide a method for
deriving interval estimates for the unknown parameters estimated in a
manner similar to that of James-Stein and to illustrate the results on
data. This is done in Section 2, using formal Bayesian ideas. The
improper prior distribution used is not chosen subjectively, however,
but is chosen because it yields an estimator similar to the James-
Stein estimator, because the resulting estimator is minimax (uniformly
dominating the vector of sample means) and admissible, because it
should lead to conservative interval estimates, and because it results
in easily computable statistics. It has been considered previously by
several authors: Baranchik (1964), Stein (1962), Leonard (1974).

The discussion in Section 2 is centered on the problem of
estimating the true batting averages of eighteen baseball players. These
data, which were used before in Efron-Morris (1975), are ideal for this
work because the true values are available. The "confidence intervals"
derived by the methods of Section 2 are about 37 percent shorter in
this problem than those for the sample mean and they contain the
true values with the proper probability. Since the true values were
not chosen from the prior, the results encourage the idea that this
method may be used generally. Such a recommendation must await further
research.

The prior distribution also is used in Section 4 to derive a
multiparameter estimator for parameters which have an exchangeable
distribution, but whose sample means have markedly unequal variances.
While the resulting estimates and interval estimates in this
illustration compare favorably to the sample mean, Section 4 is intended
only to illustrate the use of this method. The resulting rule is known
not to be minimax, however, and its properties await fuller investigation.
Still, the method appears to be as good as any suggested to date for
generalizing the James-Stein estimator to the case of unequal variances,
and it does permit construction of interval estimates.

2. A WORKED EXAMPLE: EMPIRICAL BAYES INTERVAL ESTIMATES FOR THE
BATTING AVERAGES OF EIGHTEEN BASEBALL PLAYERS. Let us consider the
problem of estimating the true means $\theta_1, \ldots, \theta_k$ of k normal distributions,
having observed the independent sample means $X_1, X_2, \ldots, X_k$. Each $X_i$
is assumed to have the same variance V which is known. Thus, given $\theta_i$,

$X_i \sim N(\theta_i, V)$,  i = 1, 2, ..., k.    (2.1)

The simplest version of the James-Stein estimator (1961) applies
when $k \ge 3$ and requires making a priori guesses $\mu_1, \mu_2, \ldots, \mu_k$ at
$\theta_1, \ldots, \theta_k$. Then $\theta_i$ is estimated by

$\hat\theta_{i,JS} = \mu_i + (1 - \hat B_{JS})(X_i - \mu_i)$    (2.2)

with

$\hat B_{JS} = (k-2)V / \sum_j (X_j - \mu_j)^2$.    (2.3)

The value $\hat B_{JS}$ in (2.3) determines how much $X_i$ should be shrunk toward
$\mu_i$. Whenever $\hat B_{JS}$ exceeds unity, it should be replaced by 1 in (2.2).

The usual estimator of $\theta_i$ is $X_i$, being the best unbiased estimator,
the best fully invariant estimator, the maximum likelihood, the least
squares and the Gauss-Markov estimator. It is minimax with the
expected sum of squared errors, the "risk," being

$E_\theta \sum_i (X_i - \theta_i)^2 / V = k$.    (2.4)

The subscript $\theta$ on the expectation operator indicates that $\theta_1, \ldots, \theta_k$
are fixed and $X_1, \ldots, X_k$ vary according to (2.1). The James-Stein
estimator is uniformly better by this criterion, having risk

$E_\theta \sum_i (\hat\theta_{i,JS} - \theta_i)^2 / V = k - (k-2) E_\theta \hat B_{JS}$,    (2.5)

which is less than k, since $\hat B_{JS} > 0$ always. If $\theta_i = \mu_i$ for all i, then
$E \hat B_{JS} = 1$ resulting in a risk equal to 2.

If the statistician prefers not to guess at the $\{\mu_i\}$ but believes
$\mu_1 = \cdots = \mu_k = \mu$ (say), he may estimate $\mu$ by $\bar X = \sum X_i / k$ and modify
(2.2) to

$\hat\theta_{i,JS} = \bar X + (1 - \hat B_{JS})(X_i - \bar X)$,    (2.6)

defining

$\hat B_{JS} = (k-3)/S$,  $S = \sum_i (X_i - \bar X)^2 / V$.    (2.7)

This version of the James-Stein estimator applies only if $k \ge 4$ (one
degree of freedom is lost in estimating $\mu$ by $\bar X$), but it ordinarily would
be preferred to (2.2) in applications to data. Its risk is

$E_\theta \sum_i (\hat\theta_{i,JS} - \theta_i)^2 / V = k - (k-3) E_\theta \hat B_{JS}$,    (2.8)

dominating the risk (2.4) of the sample means. If $\theta_1 = \cdots = \theta_k$, it is
easily checked from the chi-square distribution that $E \hat B_{JS} = 1$ and hence
that (2.8) is equal to 3. Otherwise (2.8) increases from 3 to k as
$\sum_j (\theta_j - \bar\theta)^2$ increases. Once again, it is better to modify (2.6) so that
every $\theta_i$ is estimated by $\bar X$ in the event that $\hat B_{JS} > 1$.
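A minimal sketch of the estimator (2.6)-(2.7), including the truncation of $\hat B_{JS}$ at 1, may help fix ideas. The function name is illustrative; the data are the eighteen MLE values $X_i$ from column (2) of Table 1, with $V = (0.0659)^2$ as stated in the text.

```python
def james_stein_toward_mean(x, V):
    """James-Stein estimates shrinking toward the grand mean, as in (2.6)-(2.7)."""
    k = len(x)
    if k < 4:
        raise ValueError("this version requires k >= 4")
    xbar = sum(x) / k
    S = sum((xi - xbar) ** 2 for xi in x) / V          # S in (2.7)
    B = min(1.0, (k - 3) / S) if S > 0 else 1.0        # truncate at 1
    estimates = [xbar + (1.0 - B) * (xi - xbar) for xi in x]   # (2.6)
    return estimates, B, S

# Column (2) of Table 1 (MLE values X_i).
X = [0.395, 0.375, 0.355, 0.334, 0.313, 0.313, 0.291, 0.269, 0.247,
     0.247, 0.224, 0.224, 0.224, 0.224, 0.224, 0.200, 0.175, 0.148]

estimates, B, S = james_stein_toward_mean(X, V=0.0659 ** 2)
print(S, B)
```

With these data S agrees with the value 18.93244 listed under COMPUTATIONS in Table 1, and the untruncated James-Stein factor is (k-3)/S, about 0.792.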

The estimator (2.6) was applied in Efron-Morris (1975) to the
baseball data of Table 1. The observations $X_i$* in the second column are the
batting averages of 18 batters in 1970 after 45 attempts. The variance
of each $X_i$ is known to be $V = (0.0659)^2$. The batting averages for these
players during the remainder of the season, considered to be the "true
values," will be presented later.

Instead of the James-Stein estimator (2.6), the one recommended in
this paper for $k \ge 4$ uses

$\hat\theta_i = \bar X + (1 - \hat B)(X_i - \bar X)$,    (2.9)


as in (2.6) but replaces (2.7) by the smaller value


*Actually the values $X_i$ in Table 1 are minor adjustments to the
observed averages after 45 appearances given by $X_i = 0.4841 + 0.0659 \sqrt{45}
\arcsin(2 p_i - 1)$, rounded to three significant figures. The observed
average actually is $p_i$; for example, $p_1 = 18/45 = 0.400$ for player 1
(Roberto Clemente). The arcsin transformation stabilizes variances, as
required for assumption (2.1), and the constants 0.4841 and 0.0659 are
chosen so that the $\{X_i\}$ and the $\{p_i\}$ have the same mean (0.26567) and
standard deviation (0.0659). The same transformation $0.4841 +
0.0659 \sqrt{45} \arcsin(2p - 1)$ was made to the true values, p then being the
proportion of successes during the remainder of the season for each batter.

The names of the players and other information about this problem are
contained in Efron-Morris (1975).
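The footnote's transformation can be checked on the one player for whom the raw average is quoted. The sketch below uses an illustrative function name and takes the constants 0.4841 and 0.0659 and the count of 45 attempts from the footnote; it reproduces the first entry of column (2) of Table 1 from $p_1 = 18/45$.

```python
import math

def arcsin_scale(p, n=45, a=0.4841, b=0.0659):
    """Variance-stabilized batting average: a + b * sqrt(n) * arcsin(2p - 1)."""
    return a + b * math.sqrt(n) * math.asin(2.0 * p - 1.0)

x1 = arcsin_scale(18 / 45)   # player 1, observed average 0.400
print(round(x1, 3))
```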


Table 1


THE MAXIMUM LIKELIHOOD ESTIMATES (MLE), EMPIRICAL BAYES ESTIMATES (EBE),
AND TWO ESTIMATES OF THE EBE RISK FOR EACH OF EIGHTEEN BASEBALL PLAYERS


 (1)      (2)       (3)       (4)      (5)       (6)          (7)
  i       MLE       EBE
         $X_i$   $\hat\theta_i$    $s_i$    $P_i$    $R^*_i$      $R_i$

  1      0.395     0.308     0.046    0.203   [illegible]  [illegible]
  2      0.375     0.301     0.044    0.145   [illegible]  [illegible]
  3      0.355     0.295     0.043    0.097    0.424        0.685
  4      0.334     0.288     0.042    0.057    0.398        0.287
  5      0.313     0.281     0.041    0.027    0.379       -0.006
  6      0.313     0.281     0.041    0.027    0.379       -0.006
  7      0.291     0.274     0.040    0.008    0.367       -0.198
  8      0.269     0.267     0.040    0.000    0.362       -0.274
  9      0.247     0.260     0.040    0.004    0.365       -0.234
 10      0.247     0.260     0.040    0.004    0.365       -0.234
 11      0.224     0.252     0.040    0.021    0.376       -0.067
 12      0.224     0.252     0.040    0.021    0.376       -0.067
 13      0.224     0.252     0.040    0.021    0.376       -0.067
 14      0.224     0.252     0.040    0.021    0.376       -0.067
 15      0.224     0.252     0.040    0.021    0.376       -0.067
 16      0.200     0.244     0.041    0.052    0.395        0.243
 17      0.175     0.236     0.043    0.100    0.425        0.714
 18      0.148     0.227     0.045    0.168    0.469        1.391

 MEAN    0.266     0.266     0.042    0.056    0.397        0.274
 STDEV   0.068     0.022     0.002    0.060    0.038        0.593

COMPUTATIONS: $k = 18$, $m = 7.5$, $V = (0.0659)^2$, $\bar X = 0.26567$, $S = \sum (X_i - \bar X)^2/V = 18.93244$,

$1 - \Phi(\sqrt S) = 6.76 \times 10^{-6}$, $e_{0.5}(S) = 3720.30214$, $e_{7.5}(S) = 6.77428$,

$\hat B = \frac{2m}{S}(1 - 1/e_{7.5}(S)) = 0.79229 \times 0.85238 = 0.67534$,

$\hat\theta_i = \hat B \bar X + (1-\hat B)X_i = 0.17941 + 0.32466\,X_i$, $P_i = (X_i-\bar X)^2/SV = (X_i - \bar X)^2/0.08222$,

$v = [2\hat B - 15(1-\hat B)/e_{7.5}(S)]/S = 0.63178/S = 0.03337 = (0.1827)^2$,

$R_i^* = 1 - \frac{k-1}{k}\hat B + P_i vS = 0.36218 + 0.63178\,P_i$,

$\sigma_i^*(X) = (V R_i^*)^{1/2} = 0.03966(1 + 1.7444\,P_i)^{1/2}$,

$\hat R_i = 1 - 2\frac{k-1}{k}\hat B + P_i S[2v + \hat B^2] = -0.27563 + 9.89823\,P_i$,

$\sum R_i^*/k = 0.39728$, $\sum \hat R_i/k = 0.27427$.

$$\hat B \equiv \frac{k-3}{S}\left(1 - \frac{1}{e_m(S)}\right), \qquad m \equiv \frac{k-3}{2}, \tag{2.10}$$

where for $S \equiv \sum (X_i - \bar X)^2/V$ we have defined

$$e_m(S) \equiv m \exp(S/2) \int_0^1 B^{m-1} \exp(-BS/2)\,dB. \tag{2.11}$$


The theory behind this estimator will be presented in Section 3. Here it will be described and its application illustrated. The function $e_m(S)$ increases with $S$ from $e_m(0) = 1$ at $S = 0$ to infinity as $S \to \infty$. Thus $e_m(S) > 1$ always and therefore $\hat B$ in (2.10) shrinks $X_i$ toward $\bar X$ less than the James-Stein estimator does. One can compute $e_m(S)$ by direct integration, or by using tables of the chi-square distribution, of the incomplete gamma function, or of the confluent hypergeometric function $M(a, b, z)$, Abramowitz-Stegun (1965, Chapter 13), since

$$e_m(S) = \Gamma(m+1)\left(\frac{2}{S}\right)^m \exp(S/2) \int_0^{S/2} \frac{t^{m-1}}{\Gamma(m)} \exp(-t)\,dt \tag{2.12}$$

$$= M(1, m+1, S/2) = \Gamma(m+1) \sum_{j=0}^{\infty} (S/2)^j/\Gamma(m+1+j). \tag{2.13}$$

However, it usually is simplest to compute it recursively from

$$e_{m+1}(S) = \frac{2(m+1)}{S}\,\bigl(e_m(S) - 1\bigr) \tag{2.14}$$

using the initial values

$$e_1(S) = (\exp(S/2) - 1)(2/S), \qquad e_{.5}(S) = \left(\frac{2\pi}{S}\right)^{1/2} \exp(S/2)\,\bigl(\Phi(\sqrt S) - .5\bigr), \tag{2.15}$$

$\Phi(x)$ being the cumulative distribution function of a standard normal distribution. For large values of $S$, the approximation

$$1 - \Phi(\sqrt S) \approx (2\pi S)^{-1/2} \exp(-S/2)(S+1)/(S+2) \tag{2.16}$$


may be used in (2.15), Abramowitz-Stegun (1965, p. 932). For small values of $S$, $e_m(S) \approx 1 + S/2(m+1) + S^2/4(m+1)(m+2)$, from (2.13), so

$$\hat B \approx \frac{m}{m+1}\left[1 - \frac{S}{2(m+1)(m+2)}\right] \tag{2.17}$$

ignoring terms of order $S^2$. Hence $\hat B$ decreases monotonically from $m/(m+1) = (k-3)/(k-1)$ at $S = 0$ to 0 as $S \to \infty$. The reader is cautioned about the use of (2.14) for small values of $S$. It can be numerically unstable in such cases, and then (2.13) should be used instead.

Using the values of $\{X_i\}$ in Table 1, we calculate

$\bar X = 0.26567$, $S = \sum(X_i - \bar X)^2/(0.0659)^2 = 18.93244$, $m = (k-3)/2 = 7.5$,

$1 - \Phi(\sqrt S) \approx 6.76 \times 10^{-6}$ from (2.16), $e_{.5}(S) = 3720.30214$ from (2.15),

$e_m(S) = e_{7.5}(S) = 6.77428$ by iteration of (2.14) seven times,

$B_{JS} = 15/S = 0.79229$, $\hat B = 0.79229 \times 0.85238 = 0.67534$,

$\hat\theta_i = 0.26567 + (1 - 0.67534)(X_i - 0.26567) = 0.17941 + 0.32466\,X_i$.

In this case, (2.9) shrinks the MLE toward the grand mean only 85.238 percent as much as the James-Stein estimator (2.6) does. The values $\hat\theta_i$ are recorded as the empirical Bayes estimates in the third column of Table 1.
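The calculation above can be checked numerically. The following Python sketch is an illustration only and is not part of the original computations; the function names `e_series` and `e_recursive` are hypothetical. It evaluates $e_m(S)$ both from the series (2.13) and from the recursion (2.14) started at (2.15), using the rounded MLE column of Table 1, and then forms $\hat B$ from (2.10).

```python
import math

def e_series(m, S, tol=1e-15):
    """e_m(S) from the series (2.13): sum over j of (S/2)^j / ((m+1)...(m+j))."""
    z = S / 2.0
    total, term, j = 1.0, 1.0, 0
    while term > tol * total:
        j += 1
        term *= z / (m + j)
        total += term
    return total

def e_recursive(m, S):
    """e_m(S) from the recursion (2.14), started from e_1 or e_.5 in (2.15).

    Assumes m is a positive multiple of 0.5 and S > 0.
    """
    if float(m).is_integer():
        cur, e = 1.0, (math.exp(S / 2) - 1.0) * (2.0 / S)
    else:
        phi = 0.5 * (1.0 + math.erf(math.sqrt(S / 2.0)))  # Phi(sqrt(S))
        cur, e = 0.5, math.sqrt(2 * math.pi / S) * math.exp(S / 2) * (phi - 0.5)
    while cur < m - 1e-9:
        cur += 1.0
        e = (2.0 * cur / S) * (e - 1.0)  # step from e_{cur-1} to e_cur
    return e

# MLE column of Table 1 (rounded to three decimals).
X = [0.395, 0.375, 0.355, 0.334, 0.313, 0.313, 0.291, 0.269, 0.247, 0.247,
     0.224, 0.224, 0.224, 0.224, 0.224, 0.200, 0.175, 0.148]
k, V = len(X), 0.0659 ** 2
x_bar = sum(X) / k
S = sum((x - x_bar) ** 2 for x in X) / V
m = (k - 3) / 2
e_m = e_recursive(m, S)
B_js = (k - 3) / S
B_hat = B_js * (1.0 - 1.0 / e_m)                      # (2.10)
theta_hat = [x_bar + (1.0 - B_hat) * (x - x_bar) for x in X]

print(f"S = {S:.5f}, e_m = {e_m:.5f}, B_JS = {B_js:.5f}, B_hat = {B_hat:.5f}")
# For small S the recursion loses accuracy, while the series does not.
print(e_recursive(7.5, 0.01), e_series(7.5, 0.01))
```

The last line illustrates the caution about (2.14): each recursion step multiplies the rounding error in $e_m(S) - 1$ by $2(m+1)/S$, which is large when $S$ is small.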

What precision should be attached to the estimates just derived? The error of estimate we will use is given in column 4 of Table 1 as $\sigma_i^*(X)$, computed as follows. Define

$$v \equiv [2\hat B - (1-\hat B)(k-3)/e_m(S)]/S \tag{2.18}$$

and

$$R_i^* \equiv 1 - \frac{k-1}{k}\hat B + P_i vS \tag{2.19}$$

where

$$P_i \equiv (X_i - \bar X)^2/\sum(X_j - \bar X)^2 = (X_i - \bar X)^2/VS. \tag{2.20}$$

Then $\sigma_i^*(X)$ is defined to be

$$\sigma_i^*(X) \equiv (VR_i^*)^{1/2}. \tag{2.21}$$

From the values already obtained, we compute

$v = 0.63178/S = 0.03337$, $R_i^* = 0.36218 + 0.63178\,P_i$,

$\sigma_i^*(X) = 0.0659\,(0.36218 + 0.63178\,P_i)^{1/2} = 0.03966(1 + 1.7444\,P_i)^{1/2}$.

The values $\{P_i\}$, which are recorded in column 5 of Table 1, measure in relative terms the squared distances from the individual means to the grand mean. The precision (2.21) is better for those components $i$ having $X_i$ near the center $\bar X$ of the data. This fact is completely analogous to a similar result in linear regression, that prediction errors are smaller near the mean of the explanatory variables. Values of $\sigma_i^*(X)$ appear in column 4 of Table 1. A player at the mean would have $\sigma_i^*(X) = 0.03966$, but player number 1 is farthest from the center with $P_1 = 0.203$, and therefore has $\sigma_1^*(X) = 0.046$, 16 percent larger. The typical value of $\sigma_i^*(X)$ is about 0.0415, or 37 percent less than the standard deviation 0.0659 of $X_i$. Thus, a considerable improvement in precision is claimed, equivalent to using the sample means of a sample 2.52 times as large.
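As a check on the arithmetic of (2.18) through (2.21), the following Python sketch recomputes $v$, $R_i^*$ and $\sigma_i^*(X)$ from the rounded values quoted in the text. The function names `r_star` and `sigma_star` are hypothetical, and the inputs are the rounded figures above rather than full-precision quantities.

```python
import math

# Values quoted in the text for the Table 1 data.
k, V, S = 18, 0.0659 ** 2, 18.93244
B_hat, e_m = 0.67534, 6.77428

v = (2 * B_hat - (1 - B_hat) * (k - 3) / e_m) / S       # (2.18)

def r_star(P):
    """Smoothed risk estimate (2.19) for a component with relative distance P."""
    return 1 - (k - 1) / k * B_hat + P * v * S

def sigma_star(P):
    """Error of estimate (2.21)."""
    return math.sqrt(V * r_star(P))

for P in (0.0, 0.203):          # a player at the mean, and player number 1
    print(P, round(r_star(P), 5), round(sigma_star(P), 5))

# A typical sigma* of 0.0415 relative to the standard deviation 0.0659 of X_i,
# and the equivalent sample-size multiple.
ratio = 0.0415 / math.sqrt(V)
print(round(1 - ratio, 2), round(1 / ratio ** 2, 2))
```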

Formula (2.19) is one of two estimates of the risk $E_\theta(\hat\theta_i - \theta_i)^2/V$ of the empirical Bayes estimator (2.9). These values are given as $R_i^*$ in column 6 of Table 1, and are less than the risk of the sample average $E_\theta(X_i - \theta_i)^2/V = 1$ for every player.

In column 7, the unbiased estimates $\hat R_i$ of the risks of $\hat\theta_i$ are given, computed from the formula

$$\hat R_i \equiv 1 - 2\frac{k-1}{k}\hat B + P_i[2v + \hat B^2]S. \tag{2.22}$$

The estimator in (2.22) is the unique unbiased estimator of the squared error risk of the estimator (2.9). That is

$$E_\theta \hat R_i = E_\theta(\hat\theta_i - \theta_i)^2/V, \tag{2.23}$$

for all fixed $(\theta_1, \ldots, \theta_k)$. Summing the values of (2.23) over all $k$ players, with (2.18) substituted in (2.22), we obtain

$$\hat R_+ \equiv \sum \hat R_i = k - (k-3)[\hat B + (2-\hat B)/e_m(S)]. \tag{2.24}$$

Since $\hat R_+ < k$ for all $(X_1, \ldots, X_k)$, and $\hat R_+$ is unbiased for the risk of (2.9), it follows that (2.9) is a minimax estimator of $(\theta_1, \ldots, \theta_k)$ for $k > 4$. That is

$$E_\theta \sum(\hat\theta_i - \theta_i)^2/V = k - (k-3)E[\hat B + (2-\hat B)/e_m(S)] < k \tag{2.25}$$

for every set of values $(\theta_1, \ldots, \theta_k)$. The minimax character of (2.9) was proved by Baranchik (1964).

Clearly the values $\hat R_i$ in Table 1 are unreasonable, being negative estimates of a positive quantity in the central 11 of the 18 cases. With other data these estimates might look better, but they generally tend to be quite variable. The smoother values $R_i^*$ provide more reasonable estimates of component risk, although as a group they tend to be conservative, for the following reasons. The sum of the values $R_i^*$ can be written

$$R_+^* \equiv \sum R_i^* = k - (k-3)[\hat B + (1-\hat B)/e_m(S)], \tag{2.26}$$

or using (2.24),

$$R_+^* = \hat R_+ + 2m/e_m(S). \tag{2.27}$$

It follows from (2.27) that $R_+^*$ overestimates the total risk of $(\hat\theta_1, \ldots, \hat\theta_k)$, since $\hat R_+$ is unbiased for this risk. For the data of Table 1, we calculate $\hat R_+ = 4.937$ from (2.24), $R_+^* = 7.151$ from (2.26), and $2m/e_m(S) = 2.214$. The amount $2m/e_m(S)$ that $R_+^*$ overestimates the total risk decreases as $S$ increases, and would tend to be smaller for most examples, where the true values are likely to be more dispersed.
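The totals just quoted follow directly from (2.22), (2.24), (2.26) and (2.27). The Python sketch below reproduces them from the rounded values of $\hat B$, $e_m(S)$ and $S$ given in the text; the variable and function names are hypothetical, and small discrepancies in the last digit reflect the rounding of the inputs.

```python
# Rounded values quoted in the text for the Table 1 data.
k, S = 18, 18.93244
B_hat, e_m = 0.67534, 6.77428

v = (2 * B_hat - (1 - B_hat) * (k - 3) / e_m) / S        # (2.18)

def r_hat(P):
    """Unbiased component risk estimate (2.22)."""
    return 1 - 2 * (k - 1) / k * B_hat + P * S * (2 * v + B_hat ** 2)

R_hat_plus = k - (k - 3) * (B_hat + (2 - B_hat) / e_m)   # (2.24)
R_star_plus = k - (k - 3) * (B_hat + (1 - B_hat) / e_m)  # (2.26)
excess = (k - 3) / e_m                                   # 2m / e_m(S) in (2.27)

print(round(R_hat_plus, 3), round(R_star_plus, 3), round(excess, 3))
# Intercept and slope of the unbiased estimate as a function of P_i.
print(round(r_hat(0.0), 5), round(r_hat(1.0) - r_hat(0.0), 5))
# r_hat is negative whenever P_i is below this threshold.
print(round(-r_hat(0.0) / (r_hat(1.0) - r_hat(0.0)), 4))
```

The threshold printed last is about 0.028, which is why the eleven players with $P_i \le 0.027$ in Table 1 receive negative values of $\hat R_i$.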

How well does this analysis do? The true values are given in Table 2, column 2. Column 3 presents the values $(\hat\theta_i - \theta_i)/\sigma_i^*(X)$, a distribution which ideally has zero mean and unit standard deviation. The mean of these values is -0.027, only about one-tenth of a standard deviation from that expected; the standard deviation is 0.862, meaning that the intervals are conservative. This is expected, since from (2.27),

$$E_\theta \sum \sigma_i^{*2}(X) = V E_\theta R_+^* > E_\theta \sum(\hat\theta_i - \theta_i)^2 \tag{2.28}$$

and so the $\sigma_i^*(X)$ tend to be too large (by about 15 percent in this case). For comparison, the distribution of errors of $X_i$, relative to the standard deviation of $X_i$, is given in column 4 of Table 2. The mean and standard deviation of these numbers are almost exactly what is expected from a sample of 18 numbers from a N(0,1) distribution. Hence, the intervals for $\hat\theta_i$ in this example are both shorter and more conservative than those for $X_i$.

The signs of the MLE errors in column 4 are strongly correlated with the $X_i$ values, because the true means $\theta_i$ have regressed markedly toward the mean, relative to the observed means $X_i$. Figure 1 shows this regression effect vividly, and how the $\{\hat\theta_i\}$ shrink the $\{X_i\}$ to produce better estimates. The dispersion of the $\{\hat\theta_i\}$ is even smaller than that of the true values $\{\theta_i\}$ since the ordering of the $\{\theta_i\}$ is not highly correlated with that of the $\{X_i\}$ (Spearman's rank correlation coefficient is only $\rho(\theta, X) = 0.218$ for these data). The regression to the mean effect also occurs for the EB estimates $\hat\theta_i$ in column 3, although it is much less pronounced. An even less conservative shrinking constant than $\hat B$ would be needed to eliminate the regression to the mean for these estimates and true values.

Footnote on the true values: These really are only the batting averages for the remainder of the 1970 season, being independent estimates of the true values with standard deviation $0.0659\,(45/N_i)^{1/2}$, $N_i$ given in column 7 of Table 2.


Table 2

TRUE VALUES, RELATIVE ERRORS, AND LOSSES FOR
EMPIRICAL BAYES ESTIMATES (EBE) AND
MAXIMUM LIKELIHOOD ESTIMATES (MLE)

 (1)     (2)          (3)              (4)              (5)       (6)           (7)
  i      TRUE         EBE RELATIVE     MLE RELATIVE     EBE       MLE           $N_i$
         VALUE        ERROR            ERROR            LOSS      LOSS
         $\theta_i$   $(\hat\theta_i-\theta_i)/\sigma_i^*$   $(X_i-\theta_i)/\sqrt V$   $(\hat\theta_i-\theta_i)^2/V$   $(X_i-\theta_i)^2/V$

  1      0.346        -0.831            0.744           0.339     0.553         367
  2      0.300         0.026            1.138           0.000     1.295         426
  3      0.279         0.365            1.153           0.057     1.330         521
  4      0.223         1.560            1.684           0.968     2.837         275
  5      0.276         0.124            0.561           0.006     0.315         418
  6      0.273         0.198            0.607           0.015     0.368         466
  7      0.266         0.198            0.379           0.014     0.144         586
  8      0.211         1.406            0.880           0.716     0.775         138
  9      0.271        -0.286           -0.364           0.030     0.133         510
 10      0.232         0.694            0.228           0.175     0.052         200
 11      0.266        -0.343           -0.637           0.044     0.406         277
 12      0.258        -0.145           -0.516           0.008     0.266         270
 13      0.306        -1.334           -1.244           0.668     1.548         435
 14      0.267        -0.368           -0.653           0.051     0.426         538
 15      0.228         0.598           -0.061           0.134     0.004         186
 16      0.288        -1.054           -1.335           0.439     1.783         558
 17      0.318        -1.903           -2.170           1.540     4.709         408
 18      0.200         0.609           -0.789           0.174     0.623          70

 MEAN    0.267        -0.027           -0.022           0.299     0.976         369
 STDEV   0.037         0.862            0.988           0.412     [illegible]   [illegible]

The observations of the preceding paragraph are expressed differently in Figure 2. The central dashed line is the maximum likelihood estimator (the 45 degree line); the other four dashed lines are the MLE plus or minus 1.00 and 1.96 standard deviations of $X_i$. These determine the classical 68 percent and 95 percent confidence intervals. Each player is plotted at his point $(X_i, \theta_i)$. The dashed confidence bands do very well; 12/18 of the true values are located between the 16th and 84th percentiles, and 17/18 are within the 95 percent confidence band. The middle solid line is the empirical Bayes estimator $\hat\theta_i = 0.17941 + 0.32466\,X_i$. This value $\pm\,\sigma_i^*(X)$ is intended to correspond approximately to 68 percent confidence, and $\pm\,1.96\,\sigma_i^*(X)$ to 95 percent confidence. Notice that these solid line confidence bands curve to allow for greater errors at extreme values of $X_i$. The confidence bands are conservative in this application. In congruence with the theoretical statements made after (2.27), 13/18 = 0.722 of the true values are in the central 68 percent confidence region, and all 18 are in the 95 percent region.

[Fig. 2. MLE $X_i$, MLE $\pm\,V^{1/2}$, and MLE $\pm\,1.96\,V^{1/2}$ (dashed lines) and EBE $\hat\theta_i$, EBE $\pm\,\sigma_i^*(X)$, and EBE $\pm\,1.96\,\sigma_i^*(X)$ (solid curves). Eighteen players plotted at $(X_i, \theta_i)$ using data of Tables 1, 2. Solid curves are labeled EBE 97.5th, 84th, 16th, and 2.5th percentile.]

An extremely interesting point raised by Figure 2 is that when the 95 percent confidence region is used to make a statistical test that the true value of a player is a specified value, then conflicting results can be obtained from the classical and empirical Bayes methods. Because it has shorter intervals, we expect the empirical Bayes methodology to reject certain true values when the MLE does not. For example, from Figure 2, a 0.500 season average cannot be rejected for player number 1 according to classical theory, but is out of the question from the empirical Bayes standpoint. (No one has ever approached such a value for a full season.)

The astonishing fact is that the empirical Bayes method includes two small regions that are excluded by the classical methodology. To illustrate this, a true value of 0.318 is rejected at the 95 percent level for player number 17 (Thurman Munson) by the classical test, but is not rejected at the same level using empirical Bayes intervals in Table 2. It turns out that 0.318 was Munson's true value. (And in 1976 he was voted the most valuable player in the American League!) We will not discuss this hypothesis testing problem further here, but obviously it is a worthy topic for further research.

Columns (5) and (6) show the losses incurred by the two estimators $\hat\theta_i$ and $X_i$. Only for the 10th and 15th players does $\hat\theta_i$ fail to improve on $X_i$, and in those cases the loss is small. The empirical Bayes loss $\sum(\hat\theta_i - \theta_i)^2/V$ for the 18 players is 5.38. The sample means give 17.57, close to what is expected for 18 components, but worse by a multiple of 3.27 than 5.38. The values $\hat R_i$ and $R_i^*$ from Table 1 estimate the expected value of entries in column (5) of Table 2. Since $\hat R_+ = 4.94$ and $R_+^* = 7.15$, $\hat R_+$ is closer to the combined loss $\sum(\hat\theta_i - \theta_i)^2/V = 5.38$. However the $R_i^*$ values, being smoother estimates of $E(\hat\theta_i - \theta_i)^2/V$, are much closer to the individual losses $(\hat\theta_i - \theta_i)^2/V$ of the players than are the $\hat R_i$.

Do these results hold up for other samples $\{X_i\}$ from these true values $\{\theta_i\}$? A simulation was conducted to check this and to determine whether the intervals computed by $\hat\theta_i \pm \sigma_i^*(X)$ and $\hat\theta_i \pm 1.96\,\sigma_i^*(X)$ contain the true values at least 68 percent and 95 percent of the time. Using the same true values $\{\theta_i\}$ of Table 2 each time, new values of $(X_1, \ldots, X_{18})$ were randomly drawn from the normal distribution (2.1) one hundred times, with $\mathrm{Var}(X_i) = (0.0659)^2$ in all cases.

In the 1800 experiences, the true values were contained in their nominal 68 percent intervals (in $\hat\theta_i \pm \sigma_i^*(X)$) 74 percent of the time, and in the nominal 95 percent intervals 97.3 percent of the time. In one of the 100 cases three of the true values fell outside their nominal 95 percent intervals; in nine cases two true values fell outside, in 28 cases one fell outside, and in the remaining 62 cases all 18 of the true values were in the interval $\hat\theta_i \pm 1.96\,\sigma_i^*(X)$. The average shrinking value $\hat B$ was 0.608, and $\sigma_i^*(X)$ was typically 65 percent of $V^{1/2}$, so empirical Bayes confidence intervals were both 35 percent shorter and more conservative than those based on the sample mean.
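A simulation of this kind is straightforward to repeat. The Python sketch below is not the original simulation code; it is a minimal reconstruction under the stated design (true values from column 2 of Table 2, 100 draws, common variance $(0.0659)^2$), with hypothetical function names and an arbitrary random seed. Its coverage figures will differ somewhat from the 74 and 97.3 percent reported above because the random draws differ.

```python
import math
import random

# True values from column 2 of Table 2.
THETA = [0.346, 0.300, 0.279, 0.223, 0.276, 0.273, 0.266, 0.211, 0.271,
         0.232, 0.266, 0.258, 0.306, 0.267, 0.228, 0.288, 0.318, 0.200]
V = 0.0659 ** 2

def e_series(m, S, tol=1e-15):
    """e_m(S) from the series (2.13)."""
    z, total, term, j = S / 2.0, 1.0, 1.0, 0
    while term > tol * total:
        j += 1
        term *= z / (m + j)
        total += term
    return total

def eb_estimates(X, V):
    """Return (theta_hat_i, sigma_star_i) pairs from (2.9), (2.10), (2.18)-(2.21)."""
    k = len(X)
    x_bar = sum(X) / k
    ss = sum((x - x_bar) ** 2 for x in X)
    S = ss / V
    e_m = e_series((k - 3) / 2, S)
    B = (k - 3) / S * (1 - 1 / e_m)
    v = (2 * B - (1 - B) * (k - 3) / e_m) / S
    out = []
    for x in X:
        P = (x - x_bar) ** 2 / ss
        r_star = 1 - (k - 1) / k * B + P * v * S
        out.append((x_bar + (1 - B) * (x - x_bar), math.sqrt(V * r_star)))
    return out

def coverage(n_rep=100, seed=0):
    """Fraction of true values inside the nominal 68 and 95 percent intervals."""
    rng = random.Random(seed)
    hit68 = hit95 = 0
    for _ in range(n_rep):
        X = [rng.gauss(t, math.sqrt(V)) for t in THETA]
        for (est, sd), t in zip(eb_estimates(X, V), THETA):
            z = abs(est - t) / sd
            hit68 += z <= 1.0
            hit95 += z <= 1.96
    n = n_rep * len(THETA)
    return hit68 / n, hit95 / n

c68, c95 = coverage()
print(f"68% intervals cover {c68:.3f}, 95% intervals cover {c95:.3f}")
```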

The estimate $\{X_i\}$ had average loss 18.45 (0.75 of a standard deviation above that expected), while $\{\hat\theta_i\}$ has 6.41, more efficient than the MLE by a factor of 2.88. In no case did $\{\hat\theta_i\}$ have combined loss exceeding 13.1, and its total loss never exceeded 60 percent of that of $\{X_i\}$ in any of the 100 cases.

Next consider the estimates of risk. The risk of $\{\hat\theta_i\}$ is close to 6.4, the average loss in the 100 simulations. The average value of $\hat R_+$ was 6.3; $R_+^*$ averaged 8.3. However $R_+^*$ was a better estimate of the total loss, which varied from case to case, than $\hat R_+$ in 59 of the 100 cases, and had root mean squared error 2.9 for estimating the total loss $L_+ \equiv \sum(\hat\theta_i - \theta_i)^2/V$ (i.e., $\sum(R_+^* - L_+)^2/100 = (2.9)^2$ for the 100 cases), whereas $\hat R_+$ had an inferior root mean squared error of 3.7. The component estimates of risk $R_i^*$ were much better than $\hat R_i$, as estimates of the loss $L_i \equiv (\hat\theta_i - \theta_i)^2/V$. In the root mean square sense, $|R_i^* - L_i|$ averaged 0.51 while $|\hat R_i - L_i|$ typically was 0.78. The latter errors also were more variable from problem to problem. In only 9 of the 100 cases was the root mean square of the 18 $|\hat R_i - L_i|$ values smaller than the root mean square of the $|R_i^* - L_i|$ values.

The analysis presented in Tables 1 and 2 then is typical (although slightly on the favorable side) of what would be expected from a random draw of observed values $\{X_i\}$ from the true values $\{\theta_i\}$ of Table 2. The conclusion from the simulation for these $\{\theta_i\}$ is that in addition to substantial improvement over the risk of the sample means, the empirical Bayes estimates $\{\hat\theta_i\}$ of (2.9) provide much shorter confidence intervals than the classical estimator, with nominal values that are conservative.

We cannot make similar claims at this time for the confidence intervals generated by the empirical Bayes estimator for other combinations of true values, but there is reason to expect similar results if the statistician is careful to combine estimates from problems for which the true values are exchangeable (i.e., the distribution of the $\{\theta_i\}$ should be invariant under permutations). For large $S$ the rule (2.9) is nearly equal to the James-Stein estimator, which Stein (1962) has shown leads to approximately correct confidence sets when either $S$ or $k$ is large. Over all components, (2.9) is minimax, and conservative both because it shrinks less than the James-Stein rule, and because

$$\frac{E_\theta \sum(\hat\theta_i - \theta_i)^2}{E_\theta \sum \sigma_i^{*2}(X)} < 1, \tag{2.29}$$

which follows from (2.27). But the statistician who cares about each individual component really needs to know not that (2.29) holds, but that for every $i = 1, \ldots, k$,

$$\frac{E_\theta(\hat\theta_i - \theta_i)^2}{E_\theta\,\sigma_i^{*2}(X)} \le 1, \tag{2.30}$$

or nearly so. This can fail badly if the true values fall into distinct groups (so they could not have come from the exchangeable prior on which (2.9) is based). The most dramatic example of this failure occurs for large $k$ when $\theta_2 = \ldots = \theta_k$ and $\theta_1 = \theta_2 + \sqrt{kV}$. Then, although $\sigma_i^{*2}(X) \approx V$ for $i = 1$, $E_\theta(\hat\theta_1 - \theta_1)^2 \approx Vk/4$. However the unbiased estimate $\hat R_1$ of risk of $\hat\theta_1$ will be close to the correct value $k/4$ and therefore is a much better estimate of risk than $R_1^*$ in this instance. More generally, an upper bound for $R_i^*$ is 1.5 for all $k$, $X$, achieved for $P_i \to 1$ near $S = 2k$. Thus $\sigma_i^*(X) \le 1.22\sqrt V$ always, resulting in nonconservative intervals for components that are badly estimated. A limited translation modification of the estimator (2.9) would reduce this error significantly without substantially reducing the overall efficiency of the estimate (Efron-Morris, 1972). Obviously, considerable caution must be taken when applying $\hat\theta_i$ to components with large $\hat R_i$ or large $R_i^*$ values. This example warns against too much reliance on the Bayesian interpretation of the estimator and illustrates why the statistician must consider the exchangeability assumption to be plausible before using either the James-Stein estimator or (2.9).


3. DERIVATION OF THE EMPIRICAL BAYES ESTIMATOR. The James-Stein rule (2.6) may be derived as an empirical Bayes estimator (see Efron-Morris, March 1973, and Efron-Morris, 1975) by assuming that the true values $\{\theta_i\}$ independently follow the same prior distribution with two unknown parameters $\mu \equiv E\theta_i$, $A \equiv \mathrm{Var}(\theta_i)$,

$$\theta_i \overset{ind}{\sim} N(\mu, A), \qquad i = 1, 2, \ldots, k. \tag{3.1}$$

Given $\{\theta_i\}$, the sample means have the normal distribution specified in (2.1). If $\mu$ and $A$ are known, the Bayes estimator of $\theta_i$ for squared error loss is the posterior mean

$$E(\theta_i \mid X, A, \mu) = \mu + (1-B)(X_i - \mu), \tag{3.2}$$

defining

$$B \equiv V/(V + A). \tag{3.3}$$

The marginal distribution of $\{X_i\}$ given $\mu$, $A$ is obtained by integrating $\{\theta_i\}$ out of the conditional distribution (2.1) of $\{X_i\}$, obtaining

$$X_i \mid \mu, A \overset{ind}{\sim} N(\mu, V + A), \qquad i = 1, 2, \ldots, k. \tag{3.4}$$

Thus $\bar X$ is the usual estimator of $\mu$, from (3.4), and can be used to replace the unknown $\mu$ in (3.2), while $S \equiv \sum(X_i - \bar X)^2/V$, being distributed as

$$S \sim \frac{1}{B}\,\chi^2_{k-1} \tag{3.5}$$

because of (3.4), provides a basis for estimating $B$. The unbiased estimate of $B$ is $B_{JS} = (k-3)/S$,

$$E\,B_{JS} = E\,\frac{k-3}{S} = B \tag{3.6}$$

from (3.5). Substitution of $\bar X$ and $B_{JS}$ for the unknown values $\mu$ and $B$ in (3.2) yields the James-Stein estimator (2.6) of $\{\theta_i\}$ as an empirical Bayes estimator.

Instead of the unbiased estimate, we will derive a formal Bayes estimator of $B$ by assuming $A$ is uniformly distributed on $[0, \infty)$, that is, with probability element $dA$ on $[0, \infty)$. A compelling reason for this choice is that the James-Stein estimator is the formal Bayes estimator resulting from distributing $A$ uniformly on $[-V, \infty)$. Since it is known that $A$ cannot be negative, being a variance, restricting it to $[0, \infty)$ leads to an estimator similar to but better than the James-Stein estimator. This prior has been studied before with $\mu$ known, by Stein (1962), by Baranchik (1964) who proved the resulting estimator is minimax, and again by Stein (1973) where he developed the unbiased estimator $\hat R_+$ of its risk and also observed that the rule is admissible because of a theorem of Brown (1971). Leonard (1974) discussed the prior in a Bayesian setting, and it is similar to, but not identical with, priors recommended by other Bayesians for this problem: Jeffreys (1946), Lindley-Smith (1972), Zellner-Vandaele (1975), and Good and Wallace (as interpreted by Stein (1962, p. 281)). An appealing property of this prior is that it does not depend on the variance $V$. The estimators of Strawderman (1971) do not share this property, which renders them inapplicable in the context of Section 4.

Using the density from (3.5) and $dA = -V\,dB/B^2$, the density of $B$ given $S$ is proportional to

$$f(B \mid S)\,dB = B^{m-1} \exp(-BS/2)\,dB \tag{3.7}$$

with $m \equiv (k-3)/2$ on $0 < B < 1$. Therefore the formal Bayes estimate of $B$ is

$$\hat B \equiv E(B \mid S) = \frac{\int_0^1 B^{m} \exp(-BS/2)\,dB}{\int_0^1 B^{m-1} \exp(-BS/2)\,dB}. \tag{3.8}$$

The denominator of (3.8) is, up to a scalar multiple, the marginal density function of $S$ (being an improper density). Integrating the numerator of (3.8) by parts once yields

$$-\frac{2}{S} \exp(-S/2) + \frac{2m}{S} \int_0^1 B^{m-1} \exp(-BS/2)\,dB$$

and hence (3.8) simplifies to

$$\hat B = \frac{2m}{S}\left(1 - \frac{1}{e_m(S)}\right), \tag{3.9}$$

with $e_m(S)$ defined in (2.10). Estimating $\mu$ by $\bar X$ and $B$ by (3.9) in (3.2) yields the estimator (2.9) as an empirical Bayes estimator.


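The variance $v$ of $B$ given $S$ also can be obtained. We have

$$v \equiv \mathrm{Var}(B \mid S) = -2\,\frac{\partial}{\partial S} E(B \mid S) = -2\,\frac{\partial \hat B}{\partial S}. \tag{3.10}$$

Since

$$\frac{\partial e_m(S)}{\partial S} = e_m(S)(1 - \hat B)/2, \tag{3.11}$$

it follows from (3.10), (3.9), and then (3.11) that

$$v = 2[\hat B - (1-\hat B)m/e_m(S)]/S. \tag{3.12}$$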
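The closed forms (3.9) and (3.12) can be compared with direct numerical integration of the posterior density (3.7). The Python sketch below is illustrative only: the function name `moments_of_B` and the choice of a midpoint rule with 20000 points are assumptions, adequate here because the integrand is smooth for $m = 7.5$ (for $m < 1$ the integrand is singular at $B = 0$ and a different rule would be needed).

```python
import math

def moments_of_B(m, S, n=20000):
    """Midpoint-rule integrals of B^(m-1+r) exp(-BS/2) over (0, 1), r = 0, 1, 2."""
    h = 1.0 / n
    i0 = i1 = i2 = 0.0
    for i in range(n):
        B = (i + 0.5) * h
        w = B ** (m - 1) * math.exp(-B * S / 2) * h
        i0 += w
        i1 += B * w
        i2 += B * B * w
    return i0, i1, i2

k, S = 18, 18.93244            # Table 1 data
m = (k - 3) / 2
i0, i1, i2 = moments_of_B(m, S)

B_quad = i1 / i0                                   # posterior mean, (3.8)
v_quad = i2 / i0 - B_quad ** 2                     # posterior variance of B
e_m = m * math.exp(S / 2) * i0                     # definition (2.11)
B_closed = (2 * m / S) * (1 - 1 / e_m)             # (3.9)
v_closed = 2 * (B_closed - (1 - B_closed) * m / e_m) / S   # (3.12)

print(round(e_m, 5), round(B_quad, 5), round(B_closed, 5))
print(round(v_quad, 5), round(v_closed, 5))
```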

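The unbiased estimate of component risk of any estimator of $\theta_i$ of the form $\hat\theta_i = \bar X + (1 - B(S))(X_i - \bar X)$ is, denoting $B'(S) = dB(S)/dS$ and $P_i = (X_i - \bar X)^2/SV$,

$$\hat R_i(S) = 1 - 2\,\frac{k-1}{k}\,B(S) + P_i S[B^2(S) - 4B'(S)] \tag{3.13}$$

for any shrinking function $B(S)$ which depends only on $S$. That is

$$E_\theta \hat R_i(S) = E_\theta(\hat\theta_i - \theta_i)^2/V. \tag{3.14}$$

This follows from writing

$$E_\theta(\hat\theta_i - \theta_i)^2/V = E_\theta[X_i - \theta_i - B(S)(X_i - \bar X)]^2/V = 1 + E_\theta B^2(S)(X_i - \bar X)^2/V - 2E_\theta \frac{X_i - \theta_i}{V}\,B(S)(X_i - \bar X),$$

and then applying Stein's formula, Stein (1973), Efron-Morris (1976), to obtain the identity

$$E_\theta \frac{X_i - \theta_i}{V}\,B(S)(X_i - \bar X) = E_\theta \frac{\partial}{\partial X_i}\,[B(S)(X_i - \bar X)]. \tag{3.15}$$

Formula (3.13) is obtained by computing

$$\frac{\partial}{\partial X_i}\,B(S)(X_i - \bar X) = B'(S)(X_i - \bar X)\,\frac{\partial S}{\partial X_i} + B(S)(1 - 1/k),$$

noting that $\partial S/\partial X_i = 2(X_i - \bar X)/V$, and collecting terms. The expression (2.22) for $\hat R_i$ follows from substituting (3.10) and (3.9) into (3.13).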
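Formula (3.13) applies to any shrinking function $B(S)$, so it can be coded once and reused. The Python sketch below is an illustration with hypothetical names; it approximates $B'(S)$ by a central difference, which is an assumption made for convenience rather than anything used in the paper. As a check it takes the James-Stein choice $B(S) = (k-3)/S$, for which summing (3.13) over $i$ gives $k - (k-3)^2/S$ by direct algebra.

```python
def unbiased_risk_estimates(X, V, B_func, h=1e-5):
    """Component risk estimates (3.13) for theta_hat_i = xbar + (1 - B(S))(X_i - xbar).

    B_func maps S to the shrinking factor; B'(S) is approximated numerically.
    """
    k = len(X)
    x_bar = sum(X) / k
    ss = sum((x - x_bar) ** 2 for x in X)
    S = ss / V
    B = B_func(S)
    dB = (B_func(S + h) - B_func(S - h)) / (2 * h)   # central difference for B'(S)
    return [1 - 2 * (k - 1) / k * B + ((x - x_bar) ** 2 / ss) * S * (B * B - 4 * dB)
            for x in X]

# MLE column of Table 1.
X = [0.395, 0.375, 0.355, 0.334, 0.313, 0.313, 0.291, 0.269, 0.247, 0.247,
     0.224, 0.224, 0.224, 0.224, 0.224, 0.200, 0.175, 0.148]
V = 0.0659 ** 2
k = len(X)
x_bar = sum(X) / k
S = sum((x - x_bar) ** 2 for x in X) / V

def james_stein(s):
    return (k - 3) / s

R_js = unbiased_risk_estimates(X, V, james_stein)
print(round(sum(R_js), 4), round(k - (k - 3) ** 2 / S, 4))
```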

It is interesting to note that if $B(S)$ in (3.13) is any Bayes estimator of $B$, computed as $B_\pi(S) = E_\pi(B \mid S)$ with $\pi$ the prior density of $A$, then $\mathrm{Var}_\pi(B \mid S) = -2\,dB_\pi(S)/dS$ and the unbiased estimate (3.13) of risk becomes

$$1 - 2\,\frac{k-1}{k}\,B_\pi(S) + P_i S[B_\pi^2(S) + 2\,\mathrm{Var}_\pi(B \mid S)]. \tag{3.16}$$

To compute the posterior distribution of $\{\theta_i\}$ given the data $\{X_i\}$, we need a prior distribution for $\mu$ in (3.4). This distribution is chosen to be Lebesgue (uniform) measure on $(-\infty, \infty)$, independent of the distribution on $A$, because it leads to the classical estimate $\bar X$ for $\mu$. Assuming the normal distributions (2.1) and (3.1) for $\{X_i\}$ given $\{\theta_i\}$ and $\{\theta_i\}$ given $\mu$ and $A$, Bayes theorem gives

$$\mu \mid X, A \sim N\!\left(\bar X, \frac{V + A}{k}\right). \tag{3.17}$$

To extend the result (3.2), denote $\theta = (\theta_1, \ldots, \theta_k)'$, $X = (X_1, \ldots, X_k)'$, $e = (1, 1, \ldots, 1)'$, and $I$ the $k \times k$ identity matrix. The distribution of $\theta$ is

$$\theta \mid (X, \mu, A) \sim N_k\bigl(\mu e + (1-B)(X - \mu e),\ V(1-B)I\bigr). \tag{3.18}$$

Integrating the distribution of $\mu$ (3.17) out of (3.18) yields

$$\theta \mid (X, A) \sim N_k\!\left(\bar X e + (1-B)(X - \bar X e),\ V(1-B)I + \frac{BV}{k}\,ee'\right). \tag{3.19}$$

Finally, the distribution of $A = V(1-B)/B$ given $X$ is given by (3.7). So integrating $B$ out of (3.19), using (3.7), yields

$$E(\theta \mid X) = \bar X e + (1 - \hat B)(X - \bar X e), \tag{3.20}$$

with $\hat B$ given by (3.9), and the conditional covariance matrix as

$$\frac{1}{V}\,\mathrm{Cov}(\theta \mid X) = (1 - \hat B)I + \hat B\,ee'/k + v(X - \bar X e)(X - \bar X e)'/V \tag{3.21}$$

with $v = \mathrm{Var}(B \mid S)$ given by (3.12).

It is not precisely true that

$$\theta \mid X \sim N_k\bigl(E(\theta \mid X),\ \mathrm{Cov}(\theta \mid X)\bigr) \tag{3.22}$$

has the normal distribution. But $\theta$ does have a normal distribution for every fixed $B$ (3.19), and if either $B$ is estimated by $\hat B$ without large variance, or if the normal distribution of $\bar X$ is considered as an added source of variation in (3.20), then the normal distribution should hold approximately in (3.22). We assumed this to produce interval estimates in Section 2. Formula (3.21) actually shows that the $\{\theta_i\}$ values are correlated, a fact not mentioned or used in Section 2. Thus (3.21) could be used to find posterior credibility ellipsoids for $\theta$ given $X$. Instead, Section 2 uses only the diagonal elements of (3.21)

$$\sigma_i^{*2}(X)/V = 1 - \frac{k-1}{k}\,\hat B + P_i vS, \tag{3.23}$$

and ignores the covariance.
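To make the relation between (3.21) and (3.23) concrete, the Python sketch below builds the matrix $\mathrm{Cov}(\theta \mid X)/V$ for the Table 1 data and confirms that its diagonal reproduces (3.23), while the off-diagonal entries are not zero. It is an illustration only: the values of $\hat B$ and $v$ are the rounded figures from Section 2, and the helper `corr` is a hypothetical name.

```python
import math

# MLE column of Table 1 and the rounded Section 2 values of B_hat and v.
X = [0.395, 0.375, 0.355, 0.334, 0.313, 0.313, 0.291, 0.269, 0.247, 0.247,
     0.224, 0.224, 0.224, 0.224, 0.224, 0.200, 0.175, 0.148]
k, V = len(X), 0.0659 ** 2
B_hat, v = 0.67534, 0.03337

x_bar = sum(X) / k
d = [x - x_bar for x in X]                      # deviations X_i - xbar
S = sum(di * di for di in d) / V

# (3.21): (1 - B_hat) I + B_hat ee'/k + v (X - xbar e)(X - xbar e)'/V
cov_over_V = [[(1 - B_hat) * (i == j) + B_hat / k + v * d[i] * d[j] / V
               for j in range(k)] for i in range(k)]

# (3.23): diagonal elements written with P_i = (X_i - xbar)^2 / (S V)
P = [di * di / (S * V) for di in d]
diag_323 = [1 - (k - 1) / k * B_hat + p * v * S for p in P]

def corr(i, j):
    """Posterior correlation between theta_i and theta_j implied by (3.21)."""
    return cov_over_V[i][j] / math.sqrt(cov_over_V[i][i] * cov_over_V[j][j])

print(round(cov_over_V[0][0], 4), round(diag_323[0], 4))
print(round(corr(0, 1), 3), round(corr(0, 17), 3))
```

Players on the same side of the grand mean are positively correlated a posteriori, and the two most extreme players on opposite sides are negatively correlated, which is the information a credibility ellipsoid would use and the component intervals ignore.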

In some problems the prior mean $\mu$ (3.1) may be known, and then (3.17) would be inappropriate. All the results given so far cover the case of known $\mu$ provided: $\bar X$ is replaced by $\mu$ throughout; $k-1$ and $k-3$ are changed to $k$ and $k-2$ in (3.5), (3.8); $m$ is changed to $(k-2)/2$ throughout (this is the reason for using the subscript $m$ on $e_m$ in (2.10)); $(k-1)/k$ is replaced by 1 in the middle terms of (3.13), (3.16), and (3.23); (3.19) is ignored in favor of (3.18); and the $ee'/k$ term in the middle of (3.21) is eliminated.

As stated before, the James-Stein rule is a formal Bayes estimator against the prior taking $A$ uniform on $[-V, \infty)$. Therefore the risk estimates, interval estimates, posterior distributions, and all other quantities computed in this section can be computed for the James-Stein estimator. These results are obtained by replacing $e_m(S)$ by infinity ($1/e_m(S) = 0$) in all formulas. Recall, for example, that $\hat B = B_{JS}(1 - 1/e_m(S))$, so setting $e_m(S) = \infty$ modifies $\hat B$ to $B_{JS}$. More generally, if the prior takes $A$ uniform on $[a, \infty)$, $a > -V$, then the resulting value of $\hat B$ is

$$\hat B \equiv E(B \mid S) = B_{JS}\left(1 - \frac{1}{e_m(\beta S)}\right), \qquad \beta \equiv \frac{V}{V + a}. \tag{3.24}$$

The James-Stein rule is obtained by letting $\beta \to \infty$ in (3.24), hence $e_m(\beta S) \to \infty$. The estimator of this paper is given by $a = 0$ ($\beta = 1$), while other more conservative estimators result from choices of $a > 0$ ($\beta < 1$).
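The family (3.24) is easy to tabulate. The Python sketch below evaluates $\hat B$ for several values of $\beta$ using the Table 1 value of $S$; the function names are hypothetical and the particular $\beta$ values are chosen only for illustration. It shows $\hat B$ rising toward $B_{JS}$ as $\beta$ grows and falling for $\beta < 1$.

```python
def e_series(m, S, tol=1e-15):
    """e_m(S) from the series (2.13)."""
    z, total, term, j = S / 2.0, 1.0, 1.0, 0
    while term > tol * total:
        j += 1
        term *= z / (m + j)
        total += term
    return total

def shrinkage(S, k, beta=1.0):
    """B_hat from (3.24); beta = V / (V + a) for a prior uniform on [a, infinity)."""
    m = (k - 3) / 2
    return (k - 3) / S * (1.0 - 1.0 / e_series(m, beta * S))

S, k = 18.93244, 18            # Table 1 data
B_js = (k - 3) / S
for beta in (0.25, 0.5, 1.0, 2.0, 10.0):
    print(beta, round(shrinkage(S, k, beta), 5))
print("James-Stein limit:", round(B_js, 5))
```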

4. FORMAL BAYES ESTIMATORS IN THE UNEQUAL VARIANCES CASE. Because of the success in previous sections of the formal Bayes estimator resulting from the prior

$$\theta_i \mid A \overset{ind}{\sim} N(\mu_i, A), \qquad i = 1, 2, \ldots, k \tag{4.1}$$

with the variance $A$ distributed as

$$A \sim \mathrm{Uniform}(0, \infty), \tag{4.2}$$

we use this prior again in the case where the variances of the sample means are not necessarily equal. That is, (2.1) is generalized to

$$X_i \mid \theta_i \overset{ind}{\sim} N(\theta_i, V_i), \qquad i = 1, 2, \ldots, k \tag{4.3}$$

with the $V_i$ known, but possibly unequal. This is the case that arises most frequently in applications. The equal variance situation rarely occurs, except in some designed experiments. We shall assume that the $\{\mu_i\}$ are known, because while they can be estimated, doing so causes the formulas of this section to become much more complicated without providing much additional insight. In most applications, however, estimating the $\{\mu_i\}$ would be worthwhile. Having assumed $\{\mu_i\}$ known, we take them to be zero without essential loss of generality, and replace (4.1) with

$$\theta_i \mid A \overset{ind}{\sim} N(0, A), \qquad i = 1, 2, \ldots, k. \tag{4.4}$$

By making use of Bayes' formula, and by obtaining the marginal distribution of $\{X_i\}$, (4.3) and (4.4) are equivalent to

$$\theta_i \mid (X_i, A) \overset{ind}{\sim} N\bigl((1 - B_i)X_i,\ V_i(1 - B_i)\bigr), \qquad i = 1, 2, \ldots, k \tag{4.5}$$

$$X_i \mid A \overset{ind}{\sim} N(0, A + V_i), \qquad i = 1, 2, \ldots, k \tag{4.6}$$

where we have defined

$$B_i \equiv V_i/(V_i + A). \tag{4.7}$$

Letting $S_i \equiv X_i^2/V_i$, it follows from (4.6) that

$$S_i \mid A \overset{ind}{\sim} \frac{1}{B_i}\,\chi^2_1, \qquad i = 1, 2, \ldots, k. \tag{4.8}$$

The posterior distribution of $A$ may be obtained from application of Bayes' formula to (4.2) and (4.6), or more simply to (4.2) and (4.8), to obtain the posterior probability element of $A$ given $X_1, \ldots, X_k$

$$f_S(A)\,dA = \frac{\exp\bigl(-\frac{1}{2}\sum_{j=1}^k [B_j S_j - \log(B_j)]\bigr)\,dA}{\int_0^\infty \exp\bigl(-\frac{1}{2}\sum_{j=1}^k [B_j S_j - \log(B_j)]\bigr)\,dA} \tag{4.9}$$

with each $B_j$ a function of $A$ given by (4.7).

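Formulas (4.5) and (4.9) summarize all information relevant to a Bayesian analysis. In particular

$$\hat\theta_i \equiv E(\theta_i \mid X) = (1 - \hat B_i)X_i \tag{4.10}$$

$$R_i^* \equiv \frac{1}{V_i}\,\mathrm{Var}(\theta_i \mid X) = 1 - \hat B_i + S_i v_i, \tag{4.11}$$

where we have defined, with $B_i = B_i(A)$ given by (4.7),

$$\hat B_i \equiv E(B_i \mid S) = \int_0^\infty B_i(A) f_S(A)\,dA \tag{4.12}$$

$$v_i \equiv \mathrm{Var}(B_i \mid S) = \int_0^\infty B_i^2(A) f_S(A)\,dA - \hat B_i^2. \tag{4.13}$$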

Although there are many methods and some tricks to help in computing
the integrals (4.12), (4.13), none yield simple answers like those of the
preceding sections. The simplest way to compute (4.12) and (4.13) we have
found so far is to evaluate the numerator of (4.9) at a number of points
(about 100, not equally spaced), then to divide these values by their sum
to obtain $f_S(A)$ at those points, and finally compute the 2k integrals (4.12),
(4.13) as finite sums. This is a minor task using a computer. More thought
should be given to these computational issues if the method is used
frequently.
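
The finite-sum recipe just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's program: the grid is geometric between the hypothetical limits `a_lo` and `a_hi`, each grid value is weighted by its cell width (proportional to A on such a grid), the prior on A is taken to be flat so that the numerator of (4.9) is used as printed, and the posterior variance is formed by the law of total variance from (4.5). The function name and its arguments are invented for the example.

```python
import math

def grid_posterior_summaries(x, v, n_points=100, a_lo=1e-3, a_hi=1e3):
    """Approximate (4.12)-(4.13) by finite sums, then form (4.10)-(4.11).

    x, v : observations X_i and known variances V_i.
    Returns one tuple (B_hat, var_B, theta_hat, posterior_variance) per component.
    """
    s = [xi * xi / vi for xi, vi in zip(x, v)]        # S_i = X_i^2 / V_i
    step = (a_hi / a_lo) ** (1.0 / (n_points - 1))    # unequal (geometric) spacing
    grid = [a_lo * step ** m for m in range(n_points)]

    log_w = []
    for a in grid:
        b = [vi / (vi + a) for vi in v]               # B_i(A) as in (4.7)
        log_num = -0.5 * sum(bi * si - math.log(bi) for bi, si in zip(b, s))
        log_w.append(log_num + math.log(a))           # cell width is proportional to A
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]          # subtract max to avoid underflow
    total = sum(w)
    w = [wi / total for wi in w]                      # divide the values by their sum

    summaries = []
    for xi, vi in zip(x, v):
        b_vals = [vi / (vi + a) for a in grid]
        b_hat = sum(wj * bj for wj, bj in zip(w, b_vals))                   # (4.12)
        var_b = sum(wj * bj * bj for wj, bj in zip(w, b_vals)) - b_hat**2   # (4.13)
        theta_hat = (1.0 - b_hat) * xi                                      # (4.10)
        post_var = vi * (1.0 - b_hat) + var_b * xi * xi                     # (4.11)
        summaries.append((b_hat, var_b, theta_hat, post_var))
    return summaries
```

With a flat prior the numerator of (4.9) decays like $A^{-k/2}$, so the truncation at `a_hi` matters little for k = 8 but would need more care for very small k.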

The symbols (4.10) and (4.11) are the same as those used in Sections 2,
3, and retain their meanings (except that here [...] is not estimated). So do

[...]

being derived by the same argument used to obtain (3.13) and (3.16). Then
[...] satisfies

$$[\ldots]\,(\hat\theta_i - \theta_i)^2 / V_i \qquad (4.16)$$

for every $(\theta_1, \ldots, \theta_k)$.

For illustration, these estimates are computed on the eight observations
given in Table 3. The variances $V_i$ have unit geometric mean, and
nearly increase by a factor of two (actually 1.9921) each time, leading to
$\max(V_j)/\sum V_j = 0.50$. The data and true values $\theta_i$ are fictitious, but are
carefully chosen functions of the square roots of the 16 expected squared
N(0, 1) order statistics so that: (i) the $\{\theta_i\}$ look like a sample from
N(0, 1) (hence A = 1); (ii) the values $(X_i - \theta_i)/V_i^{1/2}$ look like a N(0, 1) sample
with $\sum (X_i - \theta_i)^2 / V_i$ and $\sum (X_i - \theta_i)^2$ nearly equal to their expected values;
(iii) $\sum (\theta_i - X_i/(1+V_i))^2 / V_i$ and $\sum (\theta_i - X_i/(1+V_i))^2$ are nearly equal to their
conditional expected values given X ($X_i/(1+V_i)$ is the Bayes estimator of
$\theta_i$ if A = 1 in (4.5)); and (iv) the three squared correlations between the
pairs $(\theta_i, \log(V_i))$, $(\log(V_i), (X_i - \theta_i)/V_i^{1/2})$, and $(\theta_i, (X_i - \theta_i)/V_i^{1/2})$ have been
controlled to be near their expected values, 1/(k-1) = 0.143. The sample
is called "surprise-free" for obvious reasons. Such a sample is desired
because the purpose of this section is to illustrate the methods on only
one data set, while we hope the results will typify more general experience.
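
The description of the variances can be checked numerically. The sketch below assumes the eight $V_i$ form an exact geometric sequence with ratio 1.9921, scaled to unit geometric mean; the individual values of Table 3 are not reproduced here, so this is a consistency check rather than the author's data. Under that assumption the largest variance is almost exactly half of the total, and the total comes out near 22.3, in line with the expectation 22.32 quoted later for $\sum (X_i - \theta_i)^2$.

```python
import math

def geometric_variances(k=8, ratio=1.9921):
    """Variances in geometric progression, rescaled to unit geometric mean."""
    raw = [ratio ** i for i in range(k)]
    log_gm = sum(math.log(r) for r in raw) / k   # log of the geometric mean
    scale = math.exp(-log_gm)
    return [r * scale for r in raw]

v = geometric_variances()
share_of_largest = max(v) / sum(v)   # max(V_j) / sum V_j
total_variance = sum(v)              # expected value of sum (X_i - theta_i)^2
```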

The data and true values appear in columns (1), (2), (3), (11) of
Table 3. The amount of shrinking $\hat B_i$, column (4), increases sharply as $V_i$
increases. The values $\hat A_i$, defined in comparison to (4.7) by

$$\hat B_i = \frac{V_i}{\hat A_i + V_i}, \qquad (4.17)$$

also increase, nearly linearly in the standard deviation $(V_i)^{1/2}$.

The "true value" of A is 1.000, and $\sum \theta_i^2 / k = 1.752$, so all these values
of $\hat A_i$ are conservative, although for small $V_i$ they are slightly less
conservative because those components get higher relative weight when
estimating A. The estimates $\hat\theta_i$ and their Bayesian standard errors, which
increase with $V_i$, appear in columns (6) and (8). The $\hat\theta_i$ differ little
from $X_i$ for small $V_i$ but are shrunk considerably for large $V_i$. As usual
in the unequal variances situation, the empirical Bayes estimates order
the true means differently than the sample means do (the 4th and 8th
components are reordered). This has important implications for the theory
of ranking and selection.

The Bayesian estimate $\hat R_i$ of the ratio of the mean squared error of
$\hat\theta_i$ relative to that of $X_i$ is given in column (9). These values average
0.773, a quantity one cares about if the loss function is $\sum (\hat\theta_i - \theta_i)^2 / V_i$.
The square root, 0.879, is the average ratio of confidence interval widths,
although little improvement over the sample mean is possible for components
with small $V_i$ and much for large $V_i$. The unbiased estimates of risk
appear in column (10), averaging 0.790, slightly higher than the $\hat R_i$ average.
All the quantities in columns (1)-(10) can and should be computed when
utilizing these estimates.

The "true values" appear in column (11). The relative errors of the
estimate $\hat\theta_i$, given in column (12), have root mean square of 0.788, much
less than the nominal value 1.000. Thus, confidence intervals based on
the Bayesian standard errors would be conservative in this example. The
weighted squared errors, whose expectations are estimated by the two risk
estimates in columns (9), (10), appear in column (13). The sum of the values
in column (13), corresponding to the loss function $\sum (\hat\theta_i - \theta_i)^2 / V_i$, is 4.26,
while if $X_i$ is used, 8.00 (the expected loss) is obtained. For squared error
loss, $\sum (\hat\theta_i - \theta_i)^2 = 4.87$ while $\sum (X_i - \theta_i)^2 = 22.56$ (the expectation of this
last quantity is 22.32).

The values of the shrinking coefficients $\hat B_i$ (Table 3, column (4))
are plotted in Figure 3 against $\log(V_i)$, which is linear in i. The
amount of shrinking increases sharply as $V_i$ increases, but not as much
as the shrinking coefficient $B_i = V_i/(1 + V_i)$ for the Bayes estimator
$(1 - B_i)X_i$ which would be used in (4.5) if A were known to be equal to 1.

The value $\hat A$ of A that maximizes $f_S(A)$ in (4.9) is $\hat A = 2.345$, being
the maximum likelihood estimate of A based on the joint distribution
(4.8) of $(S_1, S_2, \ldots, S_k)$. Use of this in the Bayes estimator (4.5)
yields the empirical Bayes estimator, labeled "EBMLE" in Figure 3. As
Figure 3 illustrates, this shrinking value $B_i = V_i/(2.345 + V_i)$ is less
conservative than $\hat B_i$. For large values of k it should be nearly equal
to $\hat B_i$.

Fig. 3. Values of the shrinking coefficient $B_i = V_i/(A_i + V_i)$ for
several estimators of the form $(1 - B_i)X_i$ plotted as a function
of the logarithmic variance, for surprise-free data of Table 3.
[Horizontal axis: $\log(V_i)$, i = 1, 2, 3, .... Curves: Bayes, A = 1,
$B_i = V_i/(1 + V_i)$; EBMLE, $\hat A = 2.345$, $B_i = V_i/(\hat A + V_i)$; formal Bayes
estimator given by (4.12); James-Stein estimator; Hudson-Berger,
$B_i = .01003/V_i$ (minimax); maximum likelihood.]


The two horizontal lines at $B_i = 0$ (no shrinkage) and $B_i = 1$ (full
shrinkage to the prior mean) correspond respectively to the maximum
likelihood estimator and to the estimator that ignores the data and estimates
$\theta_i = 0$ in every case. The other estimators compromise between these extremes.

The James-Stein estimator, modified for the unequal variances
situation, has constant shrinkage $B_i = 0.084$ for i = 1, 2, ..., 8. This
estimator estimates $\theta_i$ by

(4.18)

being minimax for the loss function $\sum (\hat\theta_i - \theta_i)^2 / V_i$. It is derived by setting
$X_i^* = X_i / V_i^{1/2}$, applying the James-Stein estimator (2.6) to $\{X_i^*\}$,
and then transforming back. These transformations do not preserve the prior
distribution (4.4), however, so the resulting estimator is unsatisfactory
if the statistician thinks a priori that the $\{\theta_i\}$ are exchangeable. The
result in Figure 3 slightly overshrinks the components i = 1, 2, which are
well estimated by $X_i$, and forfeits the big improvements possible for the
components with large $V_i$.

The estimator of Hudson (1974) and Berger (1976), which estimates
$\theta_i$ by

(4.19)

is minimax for the loss function $\sum (\hat\theta_i - \theta_i)^2$. But it shrinks less, not more,
as the variances increase, and therefore can hardly shrink at all, see
Figure 3.

This is the price one pays in order to use a minimax estimator in
the case of unequal variances: almost no shrinkage will be allowed on
those components that are not well estimated by $X_i$, although they are
precisely the components where shrinkage is needed. Implicit in this
statement is another assertion: estimators that are empirical Bayes
against exchangeable priors cannot be minimax if the variance of some
component is large relative to the others. A data analyst wishing to
improve on the maximum likelihood estimator therefore must choose between
two very different kinds of estimators. Since he probably is more able
to recognize exchangeable prior distributions than to choose loss functions
(and minimax estimators are highly sensitive to the weights $L_i$ assumed in
the loss function $\sum L_i(\hat\theta_i - \theta_i)^2$), he generally will be better off using
empirical Bayes estimators. This approach also will permit him to identify
many situations when he should stay with the maximum likelihood estimator.


The empirical Bayes approach, combined with formal Bayes theory,
has one other advantage that is central to this paper. It provides
a coherent method for computing interval estimates for the estimated
parameters. For priors that yield estimators similar to the one of
this paper, these intervals promise to contain the true means in most
problems with the specified probability if the true means $\{\theta_i\}$ have any
orthogonally invariant distribution, and perhaps will do so for most
exchangeable prior distributions. If further research shows this, data
analysts will be able to identify many situations for which powerful
alternatives to the sample mean can be used.

ROBUST STATISTICAL PROCEDURES


Robert V. Hogg
Department of Statistics
The University of Iowa
Iowa City, Iowa 52242

ABSTRACT. Two proposals are given that can be used to modify the
method of least squares. The first replaces one of the factors in the
squaring process by the rank of that factor. While some success has been
achieved in applications with this procedure, the computations involved
are not as easy as with the second method. In the latter, the square
function is replaced by another function, say ρ. This ρ function can be
convex, as in Huber's M-estimators, but it can also be non-convex, as in
the descending M-estimators of Andrews and Hampel. The descending
M-estimator scheme thus requires a better preliminary estimate so as not to
find the "wrong" solution. Three examples using real data are considered.

1. INTRODUCTION. The method of least squares, that is,

minimizing $\sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2$,

has served us well for many years! But there now is concern about the
influence of "outliers" as they tend "to pull" the solution towards them
too much. Consequently the residuals (if they are even considered) are
distorted too much, and accordingly the outliers are difficult to detect.
Of course, the situation is worse if the investigator blindly takes one of
the many packaged programs and treats the answers as if they were the
"truth" without checking assumptions, etc.


Two examples are:

While the first is one that I constructed, the second is like some lumber
data that Boardman [3] considered. The investigator of that project at
first fit the least squares line, and later Boardman discovered that they
were really dealing with two populations.

To see exactly what can be gained by robust methods, consider the
example in Chapter 3 of the book by Daniel and Wood. This concerns the
operation of a plant for the oxidation of Ammonia to Nitric Acid. There
are 21 observations, in which the 3 independent variables are air flow,
cooling water inlet temperature, and acid concentration while the stack
loss is the dependent variable. The following table shows the "least
squares" betas, the "least squares" betas with four bad points thrown out,
and two sets of "robust" betas based on all 21 observations.


ESTIMATES OF BETAS

METHOD                                 β1       β2       β3

Least squares                          .72     1.30     -.15
Least squares (without outliers)       .80      .58     -.07
M-estimates (Andrews)                  .82      .52     -.07
Nonparametric (median scores)          .83      .56     -.06

While some of the details of the latter two procedures will be explained
later, please note that they give essentially the same answers using all
21 points as does least squares after 4 bad points have been removed:
seemingly these robust schemes provide a BIG advantage in applications!

2. NONPARAMETRIC PROCEDURES. While I like "nonparametrics" myself,
there are programming problems and hence we will not discuss that technique
at length. The idea is this: instead of minimizing

$\sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2$,

replace one of the factors $(y_i - \sum \beta_j x_{ij})$ by its rank, say $R_i$, and

minimize $\sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right) R_i$.

Please note that $R_i$ is a function of the $\beta_1, \beta_2, \ldots, \beta_p$, and thus an
iterated process must be used (while there are a few short cuts, the
ranking requires most of the computer time).

Of course, this nonparametric scheme can be generalized easily.
Consider the "scores"

$a(1) \le a(2) \le \cdots \le a(n)$

and then

minimize $\sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right) a(R_i)$.

Examples: (i) $a(i) = i$, then $a(R_i) = R_i$.

(ii) $a(i) = -1$ for $i < (n+1)/2$ and $a(i) = 1$ for $i > (n+1)/2$;
if $(n+1)/2$ is an integer, then $a((n+1)/2) = 0$.

The scoring in (ii) is often referred to as "median scores," and these
scores were actually used in the nonparametric scheme associated with the
Daniel and Wood example.

One final remark about these nonparametric procedures: if a constant
is subtracted from the scores a(i) so that the resulting a's are such
that

$\bar a = \frac{1}{n} \sum_{i=1}^{n} a(i) = 0$,

the minimization is equivalent to solving the p approximate equalities:

$\sum_{i=1}^{n} x_{ij}\, a(R_i) \approx 0, \quad j = 1, 2, \ldots, p.$

While several persons have worked in this area, I believe that Hettmansperger
and McKean [4] have developed the programs the most.
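
To make the rank-based criterion concrete, here is a minimal sketch for a single slope. It is not the Hettmansperger and McKean program: it assumes centered scores a(i) = i - (n+1)/2 (example (i) with the mean subtracted), distinct x values, and a brute-force search over pairwise slopes. That search is sufficient because, with nondecreasing centered scores, the criterion is a convex piecewise-linear function of the slope whose kinks occur at the pairwise slopes. The intercept drops out because the scores sum to zero. All function names are hypothetical.

```python
def centered_rank_scores(n):
    """Scores a(i) = i - (n + 1) / 2, i = 1..n, which sum to zero."""
    return [i - (n + 1) / 2.0 for i in range(1, n + 1)]

def rank_dispersion(slope, x, y, scores):
    """Sum over i of residual_i * a(R_i) for the line y = const + slope * x."""
    resid = sorted(yi - slope * xi for xi, yi in zip(x, y))
    # After sorting, the residual in position r (1-based) has rank r.
    return sum(a * e for a, e in zip(scores, resid))

def rank_slope(x, y):
    """Slope minimizing the rank dispersion, searched over all pairwise slopes."""
    n = len(x)
    scores = centered_rank_scores(n)
    candidates = [(y[j] - y[i]) / (x[j] - x[i])
                  for i in range(n) for j in range(i + 1, n)
                  if x[j] != x[i]]
    return min(candidates, key=lambda b: rank_dispersion(b, x, y, scores))
```

For example, `rank_slope([0, 1, 2, 3, 4, 5, 6], [0, 2, 4, 6, 8, 10, 100])` stays at the slope of the six collinear points despite the wild last value.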

3. M-ESTIMATORS. Huber [6] first proposed these estimators. He
suggested replacing, in least squares, the square function $\rho(w) = w^2$ by some
other ρ function and

minimizing $\sum_{i=1}^{n} \rho\!\left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right)$.

For some theoretical reasons, his first substitution was

$$\rho(w) = \begin{cases} w^2, & |w| \le c \\ 2c|w| - c^2, & |w| > c. \end{cases}$$

To clearly understand this substitution, let $y_1, y_2, \ldots, y_n$ be an observed
random sample. Let us try to estimate the unknown middle θ by the method
of least squares, noting the modification as we proceed.

$\rho(w) = w^2$ and $\min \sum_{i=1}^{n} (y_i - \theta)^2 = \min \sum_{i=1}^{n} \rho(y_i - \theta)$.

Take the derivative and equate to zero to obtain

$$\sum_{i=1}^{n} \psi(y_i - \theta) = 0,$$

where ψ = ρ'. In Huber's M-estimates (called this because if ψ = -f'/f,
where f is the density, the resulting estimate is the Maximum likelihood
estimate of the location parameter), we have some difficulty because the
formula changes at "c". Huber's ψ = ρ' is

$$\psi(w) = \begin{cases} -2c, & w < -c, \\ 2w, & -c \le w \le c, \\ 2c, & c < w. \end{cases}$$

To make the equation,

$$\sum_{i=1}^{n} \psi(y_i - \theta) = 0,$$

have a scale invariant solution, we need to introduce a scale factor s
in the following way:

$$\sum_{i=1}^{n} \psi\!\left( \frac{y_i - \theta}{s} \right) = 0.$$

A familiar s used by "robustniks" is given by

$(.6745)\,s = \mathrm{med}(|y_i - \mathrm{med}(y_i)|) = \mathrm{MAD}$,

the median of the absolute deviations. The constant c should be selected
so that if $y_1, y_2, \ldots, y_n$ arise from a normal population, most of
the numbers $|(y_i - \theta)/s| \le c$. Values of c around 1.5 or 2.0 are popular.
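
As an illustration of the location case, the sketch below solves the scaled equation by iterating a weighted mean with weights ψ(u)/u. It is a hedged example, not Huber's or the author's code: ψ is taken as u clipped to [-c, c] (a constant multiple of the ψ above, which leaves the solution unchanged), s is computed once as MAD/.6745, the start is the sample median, and the function name, iteration cap, and stopping tolerance are arbitrary choices.

```python
import statistics

def huber_location(y, c=1.5, iterations=50):
    """Huber M-estimate of the middle, with the scale s fixed at MAD / 0.6745."""
    theta = statistics.median(y)                        # robust starting value
    mad = statistics.median([abs(yi - theta) for yi in y])
    s = mad / 0.6745
    if s == 0.0:
        return theta                                    # degenerate sample: keep the median
    for _ in range(iterations):
        weights = []
        for yi in y:
            u = abs(yi - theta) / s
            weights.append(1.0 if u <= c else c / u)    # psi(u) / u for the clipped psi
        new_theta = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
        if abs(new_theta - theta) < 1e-12:
            return new_theta
        theta = new_theta
    return theta
```

On the sample 1, 2, 3, 4, 5, 100 the ordinary mean is about 19.2, while this estimate stays among the first five values.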
In the more general regression situation, we could take

$$s = \frac{\mathrm{median}\; |y_i - \text{prel. est. of middle}|}{.6745} \quad \text{(of non-zero deviations)}$$

where preliminary estimate of middle should be fairly robust. While
numerically difficult to determine, the β's that

minimize $\sum_{i=1}^{n} \left| y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right|$

would provide robust estimates. However, while not real robust, many use
least squares estimates, which is satisfactory with Huber's procedure.


The equations that we must solve in fitting $\sum_{j=1}^{p} \beta_j x_{ij}$ are, j = 1, 2, ..., p,

$$\sum_{i=1}^{n} \psi\!\left( \frac{y_i - \sum_{j=1}^{p} \beta_j x_{ij}}{s} \right) x_{ij} = 0.$$

Several iterations are usually required, and s would be recalculated on
each (there are other suggestions for s in the literature [9] that are
possibly easier to calculate).
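
A small sketch of this iteration for a straight line (intercept plus one slope) follows. It is written under assumptions that go beyond the text: the equations are solved by repeated weighted least squares with weights ψ(u)/u, ψ is u clipped to [-c, c], the start is the ordinary least squares fit, and s is recomputed each pass as the median of the non-zero absolute residuals divided by .6745. The helper names are hypothetical.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def huber_line(x, y, c=1.5, iterations=50):
    """Huber-type fit of a line by iteratively reweighted least squares."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))   # least squares start
    for _ in range(iterations):
        resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        nonzero = [abs(r) for r in resid if r != 0.0]
        if not nonzero:
            break                                            # exact fit, nothing to reweight
        s = statistics.median(nonzero) / 0.6745              # scale recalculated each pass
        w = [1.0 if abs(r) <= c * s else c * s / abs(r) for r in resid]
        intercept, slope = weighted_line(x, y, w)
    return intercept, slope
```

Because the start is least squares, this sketch protects against outliers in y far better than against high-leverage points in x.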

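Also note that if we wish to fit a non-linear function h of some
parameters, say $\beta_1, \ldots, \beta_p$, we simply replace $x_{ij}$ by $\partial h_i / \partial \beta_j$, where
$h_i$ is h with the i-th independent variables inserted (that is, those
observations corresponding to $y_i$). Note in the special case h is linear,
then $\partial h_i / \partial \beta_j = x_{ij}$.

Hence, in the non-linear case, with residuals $\Delta_i = y_i - h_i$, we solve
(by iteration)

$$\sum_{i=1}^{n} \psi\!\left( \frac{\Delta_i}{s} \right) \frac{\partial h_i}{\partial \beta_j} = 0, \quad j = 1, 2, \ldots, p,$$

where the weight $w_i = \psi(\Delta_i / s) / (\Delta_i / s)$ and $\partial h_i / \partial \beta_j$
are found from previous steps in the iteration (of course, recalculating s
each time). Of course, ordinary non-linear least squares is

$$\min \sum_{i=1}^{n} \Delta_i^2, \quad \text{which yields} \quad \sum_{i=1}^{n} \Delta_i \frac{\partial h_i}{\partial \beta_j} = 0.$$

Now we have

$$\sum_{i=1}^{n} w_i\, \Delta_i \frac{\partial h_i}{\partial \beta_j} = 0, \quad j = 1, 2, \ldots, p.$$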

4. DESCENDING M-ESTIMATORS. Several statisticians (Hampel, Andrews,
etc. [1,2]) have modified Huber's ψ function (and, of course, the
corresponding ρ) with functions that descend back to zero.

The problem, in the general regression situation, is still to solve

$$\sum_{i=1}^{n} \psi\!\left( \frac{\Delta_i}{s} \right) \frac{\partial h_i}{\partial \beta_j} = 0, \quad j = 1, 2, \ldots, p.$$

Again, weighted (linear or non-linear as is the case) least squares is
frequently used. However, since the corresponding ρ function is not
convex, the solutions may not be unique. Thus, it is extremely important
to start with a reasonably good preliminary estimate or else the iteration
process could end up with the wrong solution. One way to avoid the wrong
solution is through the use of Huber's ψ function on several iterations
before using a descending ψ function.

It is also extremely interesting to study the weights associated with
the various observations; they indicate the importance of the points. In
particular, very low or zero weights (using Hampel's or Andrews' ψ)
indicate that the corresponding points are probably outliers. To see how all
of this fits together, let us consider two illustrations, both of which
were obtained from the statisticians at the Los Alamos Sci. Lab. In each
case, the Andrews sine function was used.

Ex. 1. Evaluating the lognormal assumption on bids for wildcat oil
leases. There were 174 leases under consideration and in each case the
number of bids ranged from 10 to 18. The logs of the bids were taken, and
normality was tested using the Shapiro-Wilk W. In 64 cases out of the
174, normality was rejected. Hence it seemed that bids did not follow a
lognormal assumption.

However, it was observed that there seemed to be some very low (noise)
bids (oil firms trying to get a lease cheap). Hence, using Andrews'
procedure, the middle of the values was estimated and the weights recorded with
each observation. For illustration, here is a sample of n = 14 after 10
iterations (starting with $w_i = 1$).


(using c = 1.00)

   log(bid)       Δ_i        w_i

    15.612       -.284      1.405
    15.080       -.816      1.133
    15.824       -.072      1.442
    15.872       -.024      1.444
    15.896        .000      1.445
    14.916       -.980      1.009
    14.763      -1.133       .881
    16.148        .251      1.413
    16.246        .350      1.384
    16.727        .831      1.122
    17.289       1.392       .649
    13.529      -2.367       .000    (outlier)
    17.458       1.562       .495
    10.463      -5.433       .000    (outlier)

weighted mean = 15.896


This was done for each of the 174 leases. The outliers (low, but
noise bids) were eliminated from each. Then normality of the logs was tested
again. In this testing, only 5 of 174 cases were rejected. That is, about
3% were rejected, which is in good agreement with a 5% testing procedure.
Thus it seems that bids do have an approximate lognormal distribution once
the noise bids have been eliminated.
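
The tabulated weights can be reproduced, to the printed accuracy, by a sine-type weight of the form sin(Δ/a)/Δ that is cut to zero once |Δ| reaches πa. The sketch below is a reconstruction and not the Los Alamos program: the constant a = 0.692 (the product of the tuning constant and the scale) is not stated in the text and was inferred here from the printed weights themselves, and the residuals are taken about the reported weighted mean 15.896.

```python
import math

LOG_BIDS = [15.612, 15.080, 15.824, 15.872, 15.896, 14.916, 14.763,
            16.148, 16.246, 16.727, 17.289, 13.529, 17.458, 10.463]
CENTER = 15.896      # weighted mean reported with the table
A_ASSUMED = 0.692    # ASSUMPTION: c times s, inferred from the printed weights

def sine_weight(delta, a=A_ASSUMED):
    """Andrews-type weight sin(delta / a) / delta, zero beyond pi * a."""
    if abs(delta) >= math.pi * a:
        return 0.0                     # observation rejected as an outlier
    if delta == 0.0:
        return 1.0 / a                 # limiting value as delta tends to 0
    return math.sin(delta / a) / delta

weights = [sine_weight(y - CENTER) for y in LOG_BIDS]
weighted_mean = sum(w * y for w, y in zip(weights, LOG_BIDS)) / sum(weights)
```

That the weighted mean computed with these weights returns the center itself is the fixed-point property the iteration is designed to reach.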


Ex. 2. Half-life of Plutonium-241. Six laboratories in the U.S.
started a sample exchange program to follow the isotopic content of a
Plutonium sample which had some of ²³⁸Pu, ²³⁹Pu, ²⁴⁰Pu, ²⁴¹Pu, and ²⁴²Pu,
the latter of which was used as a base. That is, for example, values of
the ratio of the contents of ²⁴¹Pu to ²⁴²Pu were reported and denoted
by R. Every 3 to 6 months, each of the six labs would report the value of
this ratio giving a total of 78 points. They wished to fit the non-linear
function $h(t) = R_0 e^{-\lambda t}$. The data and print-out looked like this after 25
iterations.


item      w_i     mos      R_i        fit        Δ_i

  1.      7.62      0     .04471     .04470     .00001
  2.      7.58      0     .04468     .04470    -.00002
  .        .        .        .          .          .
 34.      4.20     16     .04168     .04191    -.00023
 35.      0.00     16     .04271     .04191     .00080    (outlier)


There were 6 points with zero weights (out of 78). The interesting thing
is that upon checking these "bad" points it was discovered that all 6 were
from one lab, due to a technical difficulty. (Incidentally, the half-life
seems to be about 14.4 ± .1 years. Without robust procedures, this was
about 14.8 ± 1 year.)
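
As a rough arithmetic check, the two fitted values visible in the print-out (.04470 at 0 months and .04191 at 16 months) already imply a half-life close to the quoted figure. The sketch assumes the exponential decay form $h(t) = R_0 e^{-\lambda t}$ and uses only those two rounded values, so it cannot reproduce the full 78-point fit or its stated precision.

```python
import math

fit_at_0_months = 0.04470
fit_at_16_months = 0.04191

# lambda per month from h(16) / h(0) = exp(-16 * lambda)
decay_per_month = math.log(fit_at_0_months / fit_at_16_months) / 16.0
half_life_years = math.log(2.0) / decay_per_month / 12.0
```

The result lands a little above 14 years, which agrees with the reported 14.4 within the rounding of the two fitted values.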

While there are more improvements to be made using these robust
procedures, they already provide substantial protection against outliers or
bad data points and could be used in place of standard least squares
procedures; for example, regression, ANOVA, time series, and fitting by splines.

5. ACKNOWLEDGMENT. Robert V. Hogg's research on this topic was
supported in part by NIH grant GM 22871-02.

ESTIMATING RELIABILITY FROM SMALL SAMPLES

ABSTRACT. Exact probability formulae are developed, with no restrictive
assumptions, for use with tests which produce data of the go-no-go type.
Although universally valid, the formulae are particularly apropos when small
sample size is dictated. Since a programmable calculator greatly facilitates
the solutions, programming suggestions are included.

1. INTRODUCTION. Often it is found that military, economic, or time
limitations preclude the employment of any testing technique which requires
that a large sample be taken.

Statistical treatment of small-sample data, always difficult enough, should
not be degraded by requiring unnecessary postulates or by using formulae which
yield only approximations. Consequently, the methods developed herein are based
on no assumptions other than that of random sampling, and the formulae yield
exact answers.

Increasing availability of programmable calculators with external program
storage makes this approach completely feasible. With this in mind, programming
suggestions are included where they seem to be indicated.

Since the formulae are exact, there is no theoretical limit to sample size.
There is, however, a practical one, depending jointly upon the size and operating
speed of the computer or calculator and upon the ingenuity of the programmer.

It may prove helpful to insert here a few remarks on notation and
terminology, since there are to be found variations in the literature.

Factorials are variously indicated as

$n! = \lfloor\underline{n} = 1 \cdot 2 \cdot 3 \cdots n$.

The symbol $\lfloor\underline{n}$ is chosen for use, since it acts as parentheses and thus reduces
confusion when parentheses are used for another purpose within the same
expression (e.g. Equation 38).

Generalized factorials are found as $n^{(m)}$ or $(n)_m$.

$n^{(m)} = n(n-1)(n-2) \cdots (n-m+1) = \lfloor\underline{n} \,/\, \lfloor\underline{n-m}$

will be used.

Binomial coefficients appear in many ways:

$\binom{n}{k} = {}_nC_k = C(n,k) = \dfrac{n^{(k)}}{\lfloor\underline{k}} = \dfrac{\lfloor\underline{n}}{\lfloor\underline{k}\; \lfloor\underline{n-k}}$.

The symbol C(n,k) is adopted, since it can be typed easily on a single line.

The indefinite summation symbol is taken to mean

$\sum_{a}^{x} \phi(x) = \phi(a) + \phi(a+1) + \cdots + \phi(x-1)$,

a series which consists of exactly x-a terms. The indefinite finite integral
thus is

$\Delta^{-1} \phi(x) = \sum \phi(x) + C$.

Here, $\Delta^{-1} \phi(x)$ is analogous to $\int f(x)\,dx$ in the infinitesimal calculus.

The generalized notation used for a series is

$S = T_1 + T_2 + T_3 + \cdots + T_i + \cdots$.

If there exists some value of i such that $T_j = 0$ for all j > i, the series
is finite.

Derivatives are shown by primes:

$\frac{d}{dr} f(r) = f'(r)$.

Level of confidence is denoted by L.

By "insignificant" is meant "insignificant to the computer." For example,
if a series S is being summed and $S_j$ represents the sum of the first j terms,
$T_{j+1}$ is insignificant if it is too small to affect the least significant digit
of $S_j$.

2. BINOMIAL PROBABILITY. Sometimes the testing technique permits sampling
with replacement. Even when replacement is not possible, the same condition can
be achieved (mathematically) by assuming* a population of infinite size. In
other words:

The act of sampling does not alter the characteristics of the population.

*This does not belie the statement of Paragraph 1, since the opposite case, when
an infinite population cannot be assumed, also is covered in Paragraph 4.

Given the above condition, let us specify that in a certain population, the
probability of observing a success is given by r. Obviously, the probability
of observing a failure is given by 1-r, which we shall call p,

$p = 1 - r$.

It follows that $(p+r)^n = 1$. Thus, if we draw a sample of size n, the
probability of observing exactly k failures is given by the appropriate term*
of the binomial expansion

$(p+r)^n = C(n,0)\,p^n + C(n,1)\,p^{n-1} r + C(n,2)\,p^{n-2} r^2 + \cdots + C(n,n)\,r^n$.

Since

$$\sum_{k=0}^{n} C(n,k)\, p^{n-k} r^{k} = 1, \quad (k = 0, 1, 2, \ldots, n), \qquad (1)$$

it may be said that

$C(n,k)\, p^{n-k} r^{k}$

defines a probability function in the discrete variable k.

Noting that

$C(n,k) = \dfrac{\lfloor\underline{n}}{\lfloor\underline{k}\; \lfloor\underline{n-k}} = C(n, n-k)$,

we define

$$p(k) = C(n,k)\, p^{k} (1-p)^{n-k} = C(n,k)\, r^{n-k} (1-r)^{k} \qquad (2)$$

as the probability of observing exactly k defectives (failures) in n trials.
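
Equation (2) translates directly into code. The short sketch below uses Python's built-in `math.comb` for C(n,k); the function name is an illustrative choice.

```python
import math

def prob_k_failures(n, k, r):
    """Equation (2): probability of exactly k failures in n trials, reliability r."""
    p = 1.0 - r                                   # probability of a failure
    return math.comb(n, k) * p ** k * r ** (n - k)
```

Summing `prob_k_failures(n, k, r)` over k = 0, 1, ..., n returns 1 for any r, which is the content of Equation (1).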

Unfortunately, the problem rarely is that simple. In most test designs,
it is possible to control the value of n arbitrarily, and to observe the value
of k exactly, but nothing is known about r. A probability function in r is
required.

*The term containing $p^k$.


Now r can take any value within the prescribed limits, $0 \le r \le 1$;
i.e., it is a continuous variable and necessarily

$$\int_{r=0}^{1} f(r)\,dr = 1 \qquad (3)$$

describes f(r), whatever it may turn out to be, as the required probability
function in r. Setting

$$g(r) = C(n,k)\, r^{n-k} (1-r)^{k}, \qquad (4)$$

n and k being constant, we see from Equation (2) that g(r) is a density function
in r. In order to discover a relationship between g(r) and f(r), we must
evaluate

$$\int_{r=0}^{1} g(r)\,dr = C(n,k) \int_{0}^{1} r^{n-k} (1-r)^{k}\,dr. \qquad (5)$$

To integrate*

$\int x^{n-k} (1-x)^{k}\,dx$

let

$u = (1-x)^{k}$

and

$dv = x^{n-k}\,dx$.

Then,

$du = -k(1-x)^{k-1}\,dx$

and

$v = \dfrac{x^{n-k+1}}{n-k+1}$.

*The value of $\int_0^1 x^{n-k}(1-x)^k\,dx$ is found in many tables. But a program for
computing $\int_0^z x^{n-k}(1-x)^k\,dx$, 0 < z < 1, is required, hence it is considered
desirable to show the complete process of integration. These two definite
integrals are sometimes referred to as the complete and incomplete
Beta-function.

Note that n and k are integers such that $0 < n \ge k \ge 0$.

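$$\int x^{n-k}(1-x)^{k}\,dx = \frac{x^{n-k+1}(1-x)^{k}}{n-k+1} + \frac{k}{n-k+1} \int x^{n-k+1}(1-x)^{k-1}\,dx.$$

Another similar integration by parts is performed upon the last term, yielding

$$\int x^{n-k}(1-x)^{k}\,dx = \frac{x^{n-k+1}(1-x)^{k}}{n-k+1} + \frac{k\,x^{n-k+2}(1-x)^{k-1}}{(n-k+1)(n-k+2)} + \frac{k(k-1)}{(n-k+1)(n-k+2)} \int x^{n-k+2}(1-x)^{k-2}\,dx.$$

Iterating k times results in

$$\int x^{n-k}(1-x)^{k}\,dx = \frac{x^{n-k+1}(1-x)^{k}}{n-k+1} + \frac{k\,x^{n-k+2}(1-x)^{k-1}}{(n-k+1)(n-k+2)} + \cdots + \frac{k(k-1)\cdots(k-k+2)\, x^{n-k+k}(1-x)^{k-k+1}}{(n-k+1)(n-k+2)\cdots(n-k+k)} + \frac{k(k-1)\cdots(k-k+1)}{(n-k+1)(n-k+2)\cdots(n-k+k)} \int x^{n-k+k}(1-x)^{k-k}\,dx. \qquad (6)$$

But now the last term submits to integration. It can be rewritten

$$\frac{\lfloor\underline{k}\; \lfloor\underline{n-k}}{\lfloor\underline{n}} \int x^{n}\,dx = \frac{\lfloor\underline{k}\; \lfloor\underline{n-k}}{\lfloor\underline{n}} \cdot \frac{x^{n+1}}{n+1} = \frac{x^{n+1}}{(n+1)\,C(n,k)}. \qquad (7)$$

It now becomes very easy to evaluate the definite integral

$\int_0^1 x^{n-k}(1-x)^{k}\,dx$,

since at the lower limit, all terms become zero, and at the upper limit (x=1),
all terms except the last become zero. Hence

$$\int_0^1 x^{n-k}(1-x)^{k}\,dx = \frac{\lfloor\underline{k}\; \lfloor\underline{n-k}}{\lfloor\underline{n+1}} = \frac{1}{(n+1)\,C(n,k)}. \qquad (8)$$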


Substituting this expression into Equation (5), we arrive at the remarkable
result

$$\int_{r=0}^{1} g(r)\,dr = \int_0^1 C(n,k)\, r^{n-k} (1-r)^{k}\,dr = \frac{1}{n+1}; \qquad (9)$$

i.e., $\int_0^1 g(r)\,dr$ depends upon sample size only! And thus, the desired
probability function in r is

$$f(r) = (n+1)\,g(r) = (n+1)\,C(n,k)\, r^{n-k} (1-r)^{k}. \qquad (10)$$
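
The result that the integral of g(r) equals 1/(n+1) for every k is easy to confirm. The sketch below checks it two ways: exactly, using the factorial form of the complete integral with rational arithmetic, and numerically, with a composite Simpson rule. The function names and the number of panels are illustrative choices.

```python
import math
from fractions import Fraction

def integral_of_g_exact(n, k):
    """C(n,k) * k! (n-k)! / (n+1)!, the exact integral of g over [0, 1]."""
    beta = Fraction(math.factorial(k) * math.factorial(n - k),
                    math.factorial(n + 1))
    return math.comb(n, k) * beta

def integral_of_g_simpson(n, k, panels=2000):
    """Composite Simpson approximation of the same integral (panels must be even)."""
    def g(r):
        return math.comb(n, k) * r ** (n - k) * (1.0 - r) ** k
    h = 1.0 / panels
    total = g(0.0) + g(1.0)
    for m in range(1, panels):
        total += (4 if m % 2 else 2) * g(m * h)
    return total * h / 3.0
```

For n = 7, both routes give 1/8 whatever value of k between 0 and 7 is used.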


That it be a useful probability function requires that other definite integrals
can be computed†. Substituting Equation (7) into Equation (6) and multiplying
through by (n+1)C(n,k) enables us to write

$$\int (n+1)\,C(n,k)\, x^{n-k}(1-x)^{k}\,dx = C + \sum_{i=0}^{k} C(n+1, k-i)\, x^{n+1-k+i} (1-x)^{k-i}. \qquad (11)$$

Without loss of generality, we can choose the lower limit (of the definite
integral) to be zero. The function there conveniently reduces to the constant
of integration. Also, to avoid programming problems, we can restrict the upper
limit to values less than unity. Thus, for an arbitrary value of z,

$$\int_{r=0}^{z} f(r)\,dr = \sum_{i=0}^{k} C(n+1, k-i)\, z^{n+1-k+i} (1-z)^{k-i}, \quad (0 < z < 1) \qquad (12)$$

expresses the probability that $r \le z$. The case of z = 1 already has been
covered by Equations (3) and (10), i.e.,

$\int_{r=0}^{1} f(r)\,dr = \int_0^1 (n+1)\,C(n,k)\, r^{n-k} (1-r)^{k}\,dr = 1$.

The same formula (Equation 12) can be used to solve the inverse problem;
i.e., when the level of confidence is specified. Set

$$L = 1 - \int_0^z f(r)\,dr \qquad (13)$$

then solve for z.
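
Equations (12) and (13) can be sketched as follows. The finite sum is coded as printed; the inverse problem is solved here by bisection, which is an assumption of this example, since the text says only "solve for z". Bisection is a safe choice because the sum increases steadily from 0 to 1 as z goes from 0 to 1. The function names and tolerance are illustrative.

```python
import math

def prob_r_below(z, n, k):
    """Equation (12): probability that the reliability r is at most z."""
    return sum(math.comb(n + 1, k - i) * z ** (n + 1 - k + i) * (1.0 - z) ** (k - i)
               for i in range(k + 1))

def lower_limit(confidence, n, k, tol=1e-10):
    """Equation (13): find z such that r > z at the given level of confidence L."""
    target = 1.0 - confidence          # required value of the integral from 0 to z
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_r_below(mid, n, k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `lower_limit(0.75, 7, 2)` gives the reliability that is exceeded with 75% confidence after observing 2 failures in 7 trials.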


†See Paragraph 5C.


3. ESTIMATES OF THE RELIABILITY.


A. The Function f(r).

It is worthwhile to examine the probability function

$$f(r) = (n+1)\,C(n,k)\,r^{n-k}(1-r)^{k}. \qquad (10)$$

A typical graph (n = 7, k = 2) is shown in Figure 1. The area under the curve is divided into quarters by the ordinates at r = 0.567, 0.675, and 0.779. A maximum occurs when, exclusive of the end points, f'(r) = 0; i.e., when (n-k)(1-r) - kr = 0. We shall call this maximum the "maximum likelihood estimate" of the reliability and identify it with a circumflex (^). It computes easily to be

$$\hat{r} = 1 - \frac{k}{n}. \qquad (14)$$

When n-1 > k > 1, the curve exhibits two inflection points, equally spaced about the maximum. They occur at

$$r_{pi} = \hat{r} \pm \sqrt{\frac{\hat{r}\,(1-\hat{r})}{n-1}}. \qquad (15)$$

As will be seen later, they are of interest to the programmer. Figure 1 shows inflection points at (0.530, 1.55) and (0.899, 1.01).

When k = 1, only one inflection point appears at

$$r_{pi} = \hat{r} - \frac{k}{n}$$

(see Figure 2).* Any program must take this fact into account.
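The maximum and the inflection points can be computed in a few lines. This is a sketch under stated assumptions: the function names are invented for illustration, and Equation (15) is used in the form r̂ ± sqrt(r̂(1 - r̂)/(n - 1)), which follows from setting f''(r) = 0. It reproduces the points quoted for Figure 1 and the single interior point for k = 1.

```python
from math import comb, sqrt

def f(r, n, k):
    """Equation (10)."""
    return (n + 1) * comb(n, k) * r ** (n - k) * (1 - r) ** k

def max_likelihood(n, k):
    """Equation (14)."""
    return 1 - k / n

def inflection_points(n, k):
    """Equation (15); roots that land on the end points 0 or 1 are dropped."""
    r_hat = max_likelihood(n, k)
    half_gap = sqrt(r_hat * (1 - r_hat) / (n - 1))   # requires n > 1
    candidates = (r_hat - half_gap, r_hat + half_gap)
    eps = 1e-9  # guards against round-off when a root sits exactly on an end point
    return [p for p in candidates if eps < p < 1 - eps]

for n, k in ((7, 2), (7, 1)):
    for p in inflection_points(n, k):
        print(n, k, round(p, 3), round(f(p, n, k), 3))
```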

B. Level of Confidence.

It is the nature of a function of a continuous variable that an area below the curve (i.e., a definite integral) cannot be described by a single point. A pair of points is required.

When the function under consideration is a probability function,** the ordinates erected at the selected pair of points enclose an area called the level of confidence. It is proper to think of a level of confidence as an area, as a definite integral, or as a probability. Again referring to Figure 1, it can be stated "at the 50% level of confidence, 0.567 < r < 0.779" or "at the 75% confidence level, r > 0.567." In the latter case, r = 1 is the second member of the pair.

*Quartiles: r = 0.697, 0.799, and 0.870. Inflection point at (0.714, 2.125).

**i.e., when $\int_a^b f(r)\,dr = 1$.

Selection of a level of confidence may be, and often should be, quite arbitrary. However, deferment of this selection until after preliminary test results are in, in an effort to "improve" the data, usually can be regarded as a reprehensible practice.

When selecting a confidence level (in advance, of course) it sometimes helps in visualizing it, to couch it in terms of ordinary gambler's odds, rather than the more commonly used decimal fraction. Thus, a confidence level of 0.96 gives odds of 24 to 1 against the analyst issuing erroneous advice. At 0.90, the odds drop to 9 to 1 and at 0.75 to an alarming 3 to 1.
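The conversion from a confidence level to gambler's odds is simply L / (1 - L). A small sketch (the function name is an assumption introduced here) reproduces the three figures quoted above.

```python
def odds_against_error(confidence):
    """Odds (to 1) against erroneous advice at the given confidence level."""
    return confidence / (1.0 - confidence)

for level in (0.96, 0.90, 0.75):
    print(level, round(odds_against_error(level), 6))
```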

However, there is another side to this coin. Consider what happens when a 100% level of confidence is chosen. Obviously, the pair of defining points is located at 0 and 1, regardless of the nature and shape of the probability function. Selecting too high a confidence level produces a strong masking effect by driving the defining points (limits of integration) far into the tails. A higher-than-necessary level of confidence may be a luxury the analyst can ill afford.

In summary, there are two approaches for handling the data. The first is to select (perhaps arbitrarily) two values of the argument, then compute the level of confidence (area) between them. The second* is to choose a confidence level, then compute two values of r which will bound it.

C. The Case of Zero Failures.

Specifically, when k = 0, the function degenerates to

$$f(r) = (n+1)\,r^{n}. \qquad (16)$$

Additionally, given n > 1 and r > 0,

$$f'(r) \neq 0$$
$$f''(r) \neq 0$$

and the algorithms which will be developed will fail. The function for n = 7, k = 0 is shown in Figure 3. Note that there is no point of inflection and no maximum in the usual sense. However, we still can define

$$\hat{r} = 1 - \frac{k}{n} = \frac{n}{n} = 1.$$

*See Paragraph 3D and the opening remarks of 3E.

The solution of this case is very simple and can be effected with an ordinary table of logarithms, since

$$\int_{0}^{z} f(r)\,dr = z^{\,n+1}. \qquad (17)$$

The practical solution possibilities are limited to two.

(1) Choose r = z and r = 1 as the two values of the argument, z being arbitrary but less than 1. Then the level of confidence, L, is given by

$$L = 1 - z^{\,n+1}. \qquad (18)$$

(2) Choose L. Then set r = 1 as the upper bound. The lower bound, r = z, is given by

$$z = (1 - L)^{1/(n+1)}. \qquad (19)$$

If the programmer wishes to include the case of zero failures, he should write it as a separate sub-routine.
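The separate sub-routine for zero failures needs nothing beyond a power and a root. The sketch below follows from integrating f(r) = (n+1) r^n; the function names are assumptions introduced for illustration.

```python
def confidence_zero_failures(z, n):
    """Level of confidence that r > z when no failures are seen in n trials."""
    return 1.0 - z ** (n + 1)

def lower_bound_zero_failures(L, n):
    """Lower bound z on the reliability at confidence level L (upper bound r = 1)."""
    return (1.0 - L) ** (1.0 / (n + 1))

z = lower_bound_zero_failures(0.75, 7)
print(z, confidence_zero_failures(z, 7))
```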

D. The Best Estimate of the Reliability (0 < k < n).

When both values of r are specified (r = $z_1$ and r = $z_2$), the problem is straightforward enough. Simply use Equation (12) twice to compute L.

$$L = \int_{0}^{z_2} f(r)\,dr - \int_{0}^{z_1} f(r)\,dr. \qquad (20)$$

If either $z_1 = 0$ or $z_2 = 1$, then Equation (12) need be employed only once.

But when L is specified, there are an infinite number of solution-pairs which satisfy the required condition. The usual way out of this dilemma is to set one of the limits to be 0 or 1, then solve Equation (12) for the other. Newton's method of successive approximations is well-suited to effect this solution. An algorithm will be given which converges quite rapidly upon the correct answer.

Sometimes a confidence level is specified which arbitrarily excludes equal areas from each end of the distribution. This is equivalent to two solutions with $z_1 = 0$.

But a programmable calculator makes practicable a more elegant solution. Let it be called "The Best Estimate of the Reliability." Briefly described, it is this: The level of confidence being specified, the best estimate of the reliability is given by the particular values of $z_1$ and $z_2$ which minimize the difference $z_2 - z_1$. We shall designate them with a tilde thus: $\tilde{z}_1$, $\tilde{z}_2$ or $\tilde{z}_{1,2}$.

The best estimate of the reliability possesses several distinguishing properties:

(1) The solution is unique.

(2) $\tilde{z}_2 - \tilde{z}_1$ is a minimum, by definition.

(3) $f(\tilde{z}_1) = f(\tilde{z}_2)$. (21)

That this is true is evident from Figure 4. If either ordinate is displaced away from the maximum, the other must be displaced a smaller amount to conserve area; i.e., $z_2 - z_1$ increases. This important equality is made use of in the solution.

(4) $\tilde{z}_1$ and $\tilde{z}_2$ always lie on opposite sides of $\hat{r}$. Thus is avoided the absurdity of excluding $\hat{r}$ from the solution area. This property also is used in the solution.

(5) Any included value of r is more likely than every excluded value.

Note that when k = 0, the solution is degenerate.* This should not be surprising, since r = 1 yields an absolute extremal, not a relative one.
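Properties (3) and (4) suggest one way to compute the best estimate. The sketch below is not the paper's algorithm (which uses Newton's method, see Appendix C); it swaps in two nested bisections for robustness, and all function names are assumptions. For each trial lower limit below r̂ it finds the ordinate of equal height above r̂, then adjusts the lower limit until the enclosed area equals L.

```python
from math import comb

def f(r, n, k):
    """Equation (10)."""
    return (n + 1) * comb(n, k) * r ** (n - k) * (1 - r) ** k

def cumulative(z, n, k):
    """Equation (12)."""
    return sum(comb(n + 1, k - i) * z ** (n + 1 - k + i) * (1 - z) ** (k - i)
               for i in range(k + 1))

def best_estimate(L, n, k, iterations=60):
    """Shortest pair (z1, z2) enclosing area L, with f(z1) = f(z2)."""
    if not 0 < k < n:
        raise ValueError("requires 0 < k < n")
    r_hat = 1 - k / n

    def partner(z1):
        # f decreases from r_hat to 1, so bisect for the equal-height ordinate.
        target, lo, hi = f(z1, n, k), r_hat, 1.0
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if f(mid, n, k) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    lo, hi = 0.0, r_hat
    for _ in range(iterations):
        z1 = 0.5 * (lo + hi)
        area = cumulative(partner(z1), n, k) - cumulative(z1, n, k)
        if area > L:      # interval too wide: move the lower limit inward
            lo = z1
        else:
            hi = z1
    z1 = 0.5 * (lo + hi)
    return z1, partner(z1)

print(best_estimate(0.75, 7, 2))
```

For n = 7, k = 2 and L = 0.75 the result agrees, to the three figures quoted in Paragraph 3E, with the interval 0.519 to 0.867.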

E. Comparison of Methods.

It is common practice to specify L, then set $z_2 = 1$ and compute $z_1$. Under these conditions, $z_1$ is a function of L. Although this does not invalidate the method, it indicates that due caution be exercised, lest the published value of $z_1$ reflect little more than the analyst's whim. The method can make only one kind of statement, viz. "At the 75% confidence level, r exceeds 0.567." No attempt is made to predict what r actually is (it may be far from 0.567) and nothing is said about the shape of the distribution, save that the right-hand "tail" surely is included. The method might be used by a manufacturer or user to test for compliance with a minimum standard.

*See Paragraph 3C.

On the other hand, the best estimate of the reliability states "The maximum likelihood estimate of r is 0.714 and, in any event, at the 75% level of confidence r lies between 0.519 and 0.867."

The values of $z_2 - z_1$ are 0.433 and 0.348, respectively.

The "best estimate" might be used to evaluate a new device or procedure, without reference to a pre-established criterion.

In a nut-shell, one method measures, the other tests for compliance.

Before choosing between them, the analyst must decide what sort of question he is attempting to answer.

F. The Effect of Increasing Sample Size.

What happens when the same failure rate is observed in a larger sample? This is graphically illustrated in Figures 4 and 5. It is observed that $\hat{r}$ is unchanged, but $f(\hat{r})$ increases. Also, $\tilde{z}_1$ and $\tilde{z}_2$ both move inward toward $\hat{r}$; i.e., $\tilde{z}_2 - \tilde{z}_1$ decreases. It is clear that enlarging the sample size will increase the precision of the "best estimate." If n becomes great enough, the graph of the function virtually is reduced to a tall spike at $\hat{r}$.

4. HYPERGEOMETRIC PROBABILITY. When test conditions do not permit sampling with replacement, and when the population is known to be finite (and measurable!) in size,* the theory of Paragraph 2 is not applicable. We must perforce develop another method for dealing with sampling without replacement. To parallel our earlier statement, we say:

The act of sampling measurably alters some characteristic of the remaining population.

In this Paragraph, we shall not speak of the reliability, nor shall we employ as a symbol the letter r. (As will be seen, the analogous quantity is 1 - x/N.)


Given a population consisting of N items, x of which are defective, the probability that a sample of size n will contain exactly k defectives is

$$p(k) = p(N,x,n,k) = \frac{(N-n)!\;n!\;(N-x)!\;x!}{(N-n-x+k)!\;(n-k)!\;(x-k)!\;N!\;k!} \qquad (22)$$


Notice that x and n are interchangeable in the formula, which, at our convenience, can be written in either of two ways:

$$p(k) = \frac{C(x,k)\,C(N-x,\,n-k)}{C(N,n)} = \frac{C(n,k)\,C(N-n,\,x-k)}{C(N,x)} \qquad (22)$$
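The interchangeability of x and n is easy to confirm numerically. In this sketch the function name and the values N = 26, x = 9, n = 7 are assumptions chosen only for illustration.

```python
from math import comb

def p_k(N, x, n, k):
    """Equation (22): probability of exactly k defectives in a sample of n."""
    return comb(x, k) * comb(N - x, n - k) / comb(N, n)

# Swapping x and n leaves the probability unchanged.
print(p_k(26, 9, 7, 2), p_k(26, 7, 9, 2))
# As a function of k it is a proper probability function (sums to 1).
print(sum(p_k(26, 9, 7, j) for j in range(0, 8)))
```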


*e.g., test-firing guided missiles.

But in the usual case, N, n, and k are known and it is required to estimate x; i.e., x IS THE ONLY VARIABLE. What is needed is a probability function in x.* Now Equation (22) in any of its forms gives p(k) as a probability function in k, but not necessarily in x. It is observed that with n and k being held constant, p(k) serves as a density function in the discrete variable x. In attempting to disclose the relationship between p(k) and the desired probability function in x, which we shall write as $p_x = p_x(N,x,n,k)$, we must, as the first step, evaluate the finite definite integral**

$$Q_x = Q_x(N,n,k) = \sum_{x=k}^{k+N-n} p(k), \qquad (23)$$

a series consisting of N-n+1 terms. The limits of integration are obvious, since k defectives already have been observed, and N-n is the population remaining. Substituting Equation (22) in its first form for p(k) and factoring out the constants (which do not contain x) we find

$$Q_x = \frac{(N-n)!\;n!}{(n-k)!\;N!\;k!}\sum_{x=k}^{k+N-n} (N-x)^{(n-k)}\,x^{(k)} \qquad (24)$$

where $(N-x)^{(n-k)}$ and $x^{(k)}$ denote generalized factorials.*** An expression for this integral is obtained as follows:

Let

$$u_x = (N-x)^{(n-k)}$$

and

$$\phi(x) = x^{(k)}.$$

Then

$$C + \Sigma u_x\phi(x) = \Delta^{-1}u_x\phi(x) = (EE' - 1)^{-1}u_x\phi(x) \qquad (25)$$

*cf. the discussion following Equation (3), Paragraph 2.

**i.e., sum the finite series over all possible values of x.

***The basic reference for the following derivation is George Boole's "Calculus of Finite Differences." Boole's notation (third and later editions) is used throughout.

where, temporarily, E operates on u alone, E' on φ alone. Continuing,

$$\begin{aligned}
(EE' - 1)^{-1}u_x\phi(x) &= \left[(1+\Delta)E' - 1\right]^{-1}u_x\phi(x) \\
&= (\Delta' + \Delta E')^{-1}u_x\phi(x) \\
&= \frac{1}{\Delta'}\left(1 + \frac{\Delta E'}{\Delta'}\right)^{-1}u_x\phi(x) \\
&= \frac{1}{\Delta'}\left[1 - \frac{\Delta E'}{\Delta'} + \frac{\Delta^2 E'^2}{\Delta'^2} - \cdots\right]u_x\phi(x) \qquad (26)
\end{aligned}$$


From Equation (26) we can write the desired expansion, dropping the primes as no longer necessary.

$$\Sigma u_x\phi(x) = -C + u_x\Sigma\phi(x) - \Delta u_x\Sigma^2\phi(x+1) + \Delta^2 u_x\Sigma^3\phi(x+2) - \cdots + (-1)^j\Delta^j u_x\,\Sigma^{j+1}\phi(x+j) + \cdots, \qquad (27)$$

The series of Equation (27) will terminate after n-k+1 terms, fewer by N+k-2n than that of Equation (23). It can be used to sum any number of terms of Equation (23) or Equation (24).

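It will prove useful to list a breakdown of the terms in Equation (27). This is done below.

$$\begin{aligned}
u_x &= (N-x)^{(n-k)} \\
-\Delta u_x &= (n-k)(N-x-1)^{(n-k-1)} \\
\Delta^2 u_x &= (n-k)^{(2)}(N-x-2)^{(n-k-2)} \\
&\;\;\vdots \\
(-1)^j\Delta^j u_x &= (n-k)^{(j)}(N-x-j)^{(n-k-j)} \\
&\;\;\vdots \\
(-1)^{n-k}\Delta^{n-k}u_x &= (n-k)!
\end{aligned} \qquad (28)$$

$$\begin{aligned}
\Sigma\phi(x) &= \frac{x^{(k+1)}}{k+1} \\
\Sigma^2\phi(x+1) &= \frac{(x+1)^{(k+2)}}{(k+2)^{(2)}} \\
\Sigma^3\phi(x+2) &= \frac{(x+2)^{(k+3)}}{(k+3)^{(3)}} \\
&\;\;\vdots \\
\Sigma^{j+1}\phi(x+j) &= \frac{(x+j)^{(k+j+1)}}{(k+j+1)^{(j+1)}} \\
&\;\;\vdots \\
\Sigma^{n-k+1}\phi(x+n-k) &= \frac{(x+n-k)^{(n+1)}}{(n+1)^{(n-k+1)}}
\end{aligned} \qquad (29)$$

Noting that when x < k, $\phi(x) = x^{(k)} = 0$, we have

$$\sum_{x=0}^{k-1}\phi(x) = 0$$

and

$$\sum_{x=0}^{k} u_x\phi(x) = (N-k)^{(n-k)}\,k!\,.$$

From the definition of the operator Σ,* we have

$$\Sigma u_x\phi(k) = \sum_{x=0}^{k-1} u_x\phi(x) = 0 = -C + 0 + 0 + \cdots$$

since all $\Sigma^{j+1}\phi(x+j)$ vanish when x = k. Thus C = 0,** and

$$\Sigma u_x\phi(x) = u_x\Sigma\phi(x) - \Delta u_x\Sigma^2\phi(x+1) + \cdots \qquad (30)$$

holds for all admissible values of x, i.e., k ≤ x ≤ k + N - n.

*Paragraph 1.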

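**A more rigorous demonstration is given in Appendix A.

Again remembering the definition of the operator Σ, we can evaluate the expression in Equation (24) as follows:

$$\sum_{x=k}^{k+N-n} u_x\phi(x) = u_{k+N-n}\,\phi(k+N-n) + \Sigma u_x\phi(k+N-n). \qquad (31)$$

Utilizing Equations (28) and (29) to write down the full expansion, we obtain

$$\begin{aligned}
\sum_{x=k}^{k+N-n}(N-x)^{(n-k)}x^{(k)} ={}& (k+N-n)^{(k)}\,(n-k)! + (n-k)!\,\frac{(N-n+k)^{(k+1)}}{k+1} \\
&+ (n-k)\,(n-k-1)!\,\frac{(N-n+k+1)^{(k+2)}}{(k+2)^{(2)}} \\
&+ (n-k)^{(2)}\,(n-k-2)!\,\frac{(N-n+k+2)^{(k+3)}}{(k+3)^{(3)}} \\
&+ \cdots + (n-k)^{(n-k)}\,0^{(0)}\,\frac{N^{(n+1)}}{(n+1)^{(n-k+1)}} \qquad (32)
\end{aligned}$$

Since $0^{(0)} = 0! = 1$, and since (n-k)! is a factor of every term, the last equation becomes

$$\begin{aligned}
\sum_{x=k}^{k+N-n}(N-x)^{(n-k)}x^{(k)} = (n-k)!\,\Big\{ & (N-n+k)^{(k)} + \frac{(N-n+k)^{(k+1)}}{k+1} + \frac{(N-n+k+1)^{(k+2)}}{(k+2)^{(2)}} \\
& + \frac{(N-n+k+2)^{(k+3)}}{(k+3)^{(3)}} + \cdots + \frac{N^{(n+1)}}{(n+1)^{(n-k+1)}} \Big\} \qquad (33)
\end{aligned}$$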


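To sum the series inside the braces, we return to the list of Equation (29) and notice that, except for the first term, we have exactly the values taken on when x = N-n+k. The first term, of course, is φ(N-n+k). Whence we can write symbolically

$$\begin{aligned}
\sum_{x=k}^{k+N-n}(N-x)^{(n-k)}x^{(k)} &= (n-k)!\,\left\{1 + \Sigma + \Sigma^2E + \Sigma^3E^2 + \cdots + \Sigma^{n-k+1}E^{n-k}\right\}\phi(N-n+k) \\
&= (n-k)!\,\left\{1 + \Sigma\,\frac{1 - (\Sigma E)^{n-k+1}}{1 - \Sigma E}\right\}\phi(N-n+k).
\end{aligned}$$

By the basic definitions of the operators, $1 - \Sigma E = -\Sigma$, so that

$$(n-k)!\,\left\{1 - \left[1 - (\Sigma E)^{n-k+1}\right]\right\}\phi(N-n+k) = (n-k)!\;\Sigma^{n-k+1}E^{n-k+1}\phi(N-n+k);$$

$$\Sigma^{n-k+1}E^{n-k+1}\phi(y) = \Sigma^{n-k+1}\phi(y+n-k+1), \text{ and}$$

$$\sum_{x=k}^{k+N-n}(N-x)^{(n-k)}x^{(k)} = (n-k)!\;\Sigma^{n-k+1}\phi(N+1).$$

Again referring to Equation (29),

$$\sum_{x=k}^{k+N-n}(N-x)^{(n-k)}x^{(k)} = (n-k)!\,\frac{(N+1)^{(n+1)}}{(n+1)^{(n-k+1)}} = \frac{(n-k)!\;k!\;(N+1)!}{(n+1)!\;(N-n)!}.$$

Substituting this value into Equation (24) yields the desired integral

$$Q_x = \sum_{x=k}^{k+N-n} p(N,x,n,k) = \frac{N+1}{n+1}.$$

Finally, the required probability function in x is

$$p_x = p_x(N,x,n,k) = \frac{n+1}{N+1}\,p(k) = \frac{(n+1)!\;(N-n)!\;(N-x)!\;x!}{(N+1)!\;(n-k)!\;(N-n-x+k)!\;(x-k)!\;k!} = \frac{C(N-x,\,n-k)\cdot C(x,k)}{C(N+1,\,n+1)} \qquad (38)$$

Note that this probability function differs by only a constant multiplier from the original function, p(k), given in Equation (22). However, x and n no longer are interchangeable, due to the presence of the factor n+1.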
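Both the value of the finite integral and the normalization of the probability function in x can be verified by direct summation. The sketch below uses assumed function names and the illustrative values N = 26, n = 7, k = 2; the direct sum should equal (N+1)/(n+1) and the rescaled function should sum to unity.

```python
from math import comb

def p_k(N, x, n, k):
    """Equation (22), viewed here as a function of x."""
    return comb(x, k) * comb(N - x, n - k) / comb(N, n)

def p_x(N, x, n, k):
    """Probability function in x: C(N-x, n-k) C(x, k) / C(N+1, n+1)."""
    return comb(x, k) * comb(N - x, n - k) / comb(N + 1, n + 1)

N, n, k = 26, 7, 2
xs = range(k, k + N - n + 1)                 # the N-n+1 admissible values of x
Q = sum(p_k(N, x, n, k) for x in xs)         # finite integral of Equation (23)
total = sum(p_x(N, x, n, k) for x in xs)     # should be 1
print(Q, (N + 1) / (n + 1))
print(total)
```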


The technique of Equations (27), (28), (29), and (30) can be used to sum any number of terms of the probability integral. Thus, provided only that m is some proper value of x, (k ≤ m ≤ k + N - n),

$$\frac{(N-n)!\;(n+1)!}{(n-k)!\;(N+1)!\;k!}\sum_{x=k}^{m-1}(N-x)^{(n-k)}x^{(k)} = \frac{(N-n)!\;(n+1)!}{(n-k)!\;(N+1)!\;k!}\;\Sigma u_x\phi(m) \qquad (39)$$

gives the probability that FEWER than m defectives will be found in N. As previously noted (Equation (27)), the right-hand side of Equation (39) will contain n-k+1 terms. An alternate expansion for $\Sigma u_x\phi(m)$ which sums in fewer terms whenever m < n is given in Appendix B. This alternate expansion is preferable for programming.


The graph of the function is, of course, a histogram composed of rectangles of equal width but varying height (Figure 6). For any arbitrary value of x, the area (integral) of the corresponding rectangle can be computed by Equation (38). The combined area of any number of consecutive rectangles can be computed by Equation (39) and interpreted as a level of confidence.

The inverse problem is not so clear-cut, however, since no attempt is made to attach meaning to "a portion of a rectangle." Thus, any assigned confidence level must include the phrase "greater than" or "less than." Repeated application of Equation (39) to successive values of x will reveal the correct answer. It may be useful to employ Equation (13) to obtain a fairly close first approximation.

Borrowing the terminology of Paragraph 3 and referring to Figure 6, we can make statements like:


"That x < 12 exceeds the 80% confidence level," or

"Best estimate of x: $\hat{x} = 7$, and at the 0.74891 level of confidence, $\tilde{x}_2 = 11$ and $\tilde{x}_1 = 4$," i.e., 4 ≤ x ≤ 11.
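The two quoted statements can be reproduced by summing rectangles of the histogram. The parameters of Figure 6 are not stated in this section; the values N = 26, n = 7, k = 2 used below are an assumption, chosen because they reproduce the quoted level 0.74891 and the best estimate x = 7. Function names are likewise illustrative.

```python
from math import comb

def p_x(N, x, n, k):
    """Probability that the population holds exactly x defectives."""
    return comb(x, k) * comb(N - x, n - k) / comb(N + 1, n + 1)

def fewer_than(m, N, n, k):
    """Probability that fewer than m defectives exist (direct summation)."""
    return sum(p_x(N, x, n, k) for x in range(k, m))

def level_between(x_low, x_high, N, n, k):
    """Level of confidence that x_low <= x <= x_high."""
    return sum(p_x(N, x, n, k) for x in range(x_low, x_high + 1))

N, n, k = 26, 7, 2   # assumed parameters, see the note above
mode = max(range(k, k + N - n + 1), key=lambda x: p_x(N, x, n, k))
print(mode)
print(level_between(4, 11, N, n, k))
print(fewer_than(12, N, n, k))
```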

5. COMPUTATIONAL PROCEDURES.

A. Significant Digits. The occurrence of large factorials really permits no alternative to computation by logarithms. Now two processes which are prodigal of significant digits are subtraction of nearly equal numbers and computing antilogarithms.* We can be subjected to both hazards within the same algorithm. Therefore, it is suggested that computations be carried to 12 or 13 significant digits. For machines which do not compute logarithms accurately enough, the following is suggested:


T^OITTb  of  the  order  of  5300.  Four  significant  digits  will  be  lost  when 
subsequently passing to an antilogarithm.

FIGURE 6 (ordinates of the histogram)

      x      p_x           x      p_x
      2                   12    0.05952
      3                   13    0.04522
      4    0.07117        14    0.03246
      5    0.09166        15    0.02185
      6    0.10475        16    0.01362
      7    0.10999        17    0.00772
      8    0.10806        18    0.00386
      9    0.10034        19    0.00162
     10    0.08854        20    0.00051
     11    0.07440        21    0.00009

Express the number in scientific notation thus: 378 = 3.78 × 10². If the resulting units digit is 1, proceed directly. If it is 2, 3, 4, or 5, divide the left-hand member by e = 2.718 281 828 459, intending to add ln e = 1 to the result later. If the units digit is 6, 7, 8, or 9, divide by e² and add 2 later. Call the resulting number y. For our present example,

$$y = \frac{3.78}{e} = 1.39\cdots$$

Now use the transformation

$$\zeta = \frac{y-1}{y+1}.$$

The series,

$$\tfrac{1}{2}\ln y = \zeta + \frac{\zeta^3}{3} + \frac{\zeta^5}{5} + \frac{\zeta^7}{7} + \cdots, \qquad (40)$$

will converge rapidly.* The exponent is recovered by adding or subtracting ln 10 = 2.302 585 092 994, a suitable number of times.
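B. Stirling's Formula for n!. Bernoulli's Numbers.


Stirling's formula for n! is

$$n! = \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n} e^{S}, \qquad (41)$$

where S is the asymptotic series

$$S = \frac{B_1\,n^{-1}}{1\cdot 2} - \frac{B_2\,n^{-3}}{3\cdot 4} + \frac{B_3\,n^{-5}}{5\cdot 6} - \cdots. \qquad (42)$$

The $B_j$ are Bernoulli's numbers, the first six of which are

$$B_1 = \tfrac{1}{6},\quad B_2 = \tfrac{1}{30},\quad B_3 = \tfrac{1}{42},\quad B_4 = \tfrac{1}{30},\quad B_5 = \tfrac{5}{66},\quad B_6 = \tfrac{691}{2730}.$$

*To continue the example, $\zeta = \dfrac{y-1}{y+1} = \dfrac{0.39\cdots}{2.39\cdots} = 0.163\cdots$, and the eighth term is $1.02 \times 10^{-13}$.

For thirteen-digit accuracy, n > 11 requires four terms of the series S, n > 39 but three terms. Thus,

$$\ln n! = \tfrac{1}{2}\ln(2\pi n) + n(\ln n - 1) + \frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7}, \quad n > 11$$

or

$$\ln n! = \tfrac{1}{2}\ln(2\pi n) + n(\ln n - 1) + \frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5}, \quad n > 39. \qquad (43)$$

Logarithms of smaller factorials must be computed directly, of course.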
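A possible rendering of the Stirling routine is sketched below. The function name is an assumption, the series coefficients 1/12, 1/360, 1/1260, 1/1680 are the standard values obtained from the first four Bernoulli numbers, and `math.lgamma` serves only as an independent check.

```python
import math

def ln_factorial(n):
    """ln(n!) by Stirling's series for n > 11, directly for smaller n."""
    if n <= 11:
        return math.log(math.factorial(n))     # small factorials: direct
    s = 1 / (12 * n) - 1 / (360 * n ** 3) + 1 / (1260 * n ** 5)
    if n <= 39:
        s -= 1 / (1680 * n ** 7)               # fourth term needed below n = 40
    return 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1) + s

for n in (5, 12, 40, 1000):
    print(n, ln_factorial(n), math.lgamma(n + 1))
```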

C. Newton's Method. For the solution of otherwise-difficult inverses, Newton's method of successive approximations is indispensable. However, certain precautions must be taken by the programmer.

Ideally, the graph of the function is an ogive.* But it serves the purpose equally well if two values of the argument can be found which surely bracket the desired solution and between which the function behaves like an ogive.

FIGURE 7 (an ogive, with its point of inflection marked)

The basic operation, of course, is

$$x_{j+1} = x_j + \frac{f(z) - f(x_j)}{f'(x_j)}, \qquad (44)$$

f(z) being given, and from which it is required to find z. Let the first approximation be taken at the inflection point. Since the slope is steepest there, it insures that the approximate solutions will not overshoot the true one. Thus the $x_j$'s will remain within bounds, avoiding a spurious solution or runaway.

*i.e., bounded by a maximum and a minimum, with a single point of inflection between.

Were we to express the cumulative probability of Equation (12) as a function of z,

$$F(z) = \int_{r=0}^{z} f(r)\,dr, \qquad (45)$$

we would find that its graph is a true ogive, that its derivative is simply F'(z) = f(z), and that the inflection point occurs at $\hat{r}$.
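Applied to the cumulative probability, Newton's method started at the inflection point r̂ = 1 - k/n gives the one-sided bound directly. The sketch below is illustrative: the function names, the tolerance δ, and the iteration cap are assumptions. The target value is F(z) = 1 - L, and the derivative of F is the density f.

```python
from math import comb

def f(r, n, k):
    """Density of Equation (10); also the derivative of the cumulative F."""
    return (n + 1) * comb(n, k) * r ** (n - k) * (1 - r) ** k

def cumulative(z, n, k):
    """Cumulative probability F(z), Equation (12)."""
    return sum(comb(n + 1, k - i) * z ** (n + 1 - k + i) * (1 - z) ** (k - i)
               for i in range(k + 1))

def solve_z(L, n, k, delta=1e-12, max_iter=50):
    """Lower bound z such that r > z at confidence level L (requires 0 < k < n)."""
    target = 1.0 - L
    z = 1 - k / n                      # start at the inflection point of F
    for _ in range(max_iter):
        step = (target - cumulative(z, n, k)) / f(z, n, k)
        z += step
        if abs(step) - delta < 0:      # exit test on the size of the correction
            break
    return z

print(solve_z(0.75, 7, 2))
```

For n = 7, k = 2, L = 0.75 this returns a value close to the 0.567 quoted in Paragraph 3B.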

D. Summation of Series. Many of the formulae herein developed for use involve the summation of series. A convenient way of handling this type of computation in a programmable calculator is to discover and employ a term-to-term recurrence relationship.

Usually, infinite series offer no problem. For example, in Equation (40) we can choose to assign only odd subscripts to terms, whence, with ζ = (y - 1)/(y + 1) denoting the transformed argument of that series,

$$T_{i+2} = \rho(i)\,T_i, \qquad \rho(i) = \frac{i}{i+2}\,\zeta^2.$$

ρ(i) is known as the recurrence ratio. It is of most use to the programmer when it is a constant or a function of position only.

Finite series ostensibly offer a choice: they can be summed from either end. Not really. When there are only a few terms, it probably makes no difference. But when there are many, the least term always should be left until last. There are three compelling reasons for this:

(1) The earlier the large terms are computed, the less accumulated round-off or truncation error they will contain.

(2) When employing a recurrence ratio, no term can contain more significant digits than the first term. In a fixed point machine, computing the least significant term first may result in complete disaster.

(3) If some terms are insignificant, it is unnecessary to waste computer time on them, provided the significant terms are computed first. In this case, the effect is quite similar to summing an infinite series.
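A recurrence ratio keeps each new term to one multiplication. The sketch below sums the odd-power series ζ + ζ³/3 + ζ⁵/5 + ..., largest term first, using the ratio i ζ²/(i + 2) between consecutive terms; the function name and the tolerance δ are assumptions introduced here. The sum equals ½ ln((1 + ζ)/(1 - ζ)), which `math.atanh` computes independently.

```python
from math import atanh

def half_log(zeta, delta=1e-15):
    """Sum zeta + zeta^3/3 + zeta^5/5 + ... by term-to-term recurrence."""
    term = zeta                  # T_1, the largest term, is computed first
    total = term
    i = 1
    while abs(term) - delta > 0:             # stop once terms are insignificant
        term *= zeta * zeta * i / (i + 2)    # T_{i+2} = rho(i) * T_i
        total += term
        i += 2
    return total

print(half_log(0.1634), atanh(0.1634))
```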
E. Exiting a Loop. Many of the formulae developed can advantageously employ an iterative process in the computation.* A program must employ some device for terminating this process; i.e., exiting the loop. Basically, there are two cases which must be treated.

The first occurs when the number of iterations is known, or can be determined readily. The programmer merely finds a factor (or sets up a dummy index) which is known to reach zero eventually, and tests it.

The second (and more sensitive) obtains when the number of iterations depends upon the results of the calculations. It is a mistake to test the untreated single term, since it may become insignificant to the result, but yet not zero. It is tempting to test the difference between two successive solutions, but it is possible (particularly with Newton's method) to reach two alternating solutions which differ only in the least significant digit. A nearly foolproof procedure is to establish a maximum allowable error (call it δ), subtract it from the absolute value of the quantity in question, then test the sign of the difference. It may be necessary or desirable to choose a δ which squanders two or three (ostensibly) significant digits, in order to hasten the exit.


*Summation of Series, Newton's Method, Factorials.

APPENDIX A


EVALUATING A CONSTANT OF INTEGRATION


Equation (27) states

$$\Sigma u_x\phi(x) = -C + u_x\Sigma\phi(x) - \Delta u_x\Sigma^2\phi(x+1) + \Delta^2 u_x\Sigma^3\phi(x+2) - \cdots + (-1)^j\Delta^j u_x\,\Sigma^{j+1}\phi(x+j) + \cdots \qquad (27)$$

and it is required to evaluate C, the constant of integration.

Now the admissible values of x are

$$k \le x \le k + N - n$$

and the fastidious may object to the development and inclusion of an expression like

$$\sum_{x=0}^{k-1} u_x\phi(x) = 0.$$

So, let us increase the upper limit by unity. That the expression

$$\sum_{x=0}^{k} u_x\phi(x)$$

has a real sum, and that the sum is

$$\sum_{x=0}^{k} u_x\phi(x) = (N-k)^{(n-k)}\,k!$$

there can be no doubt.

Continuing in the manner of Equation (30), we have

$$\begin{aligned}
\sum_{x=0}^{k} u_x\phi(x) = (N-k)^{(n-k)}\,k! = -C &+ (N-k-1)^{(n-k)}\,\frac{(k+1)^{(k+1)}}{k+1} + (n-k)(N-k-2)^{(n-k-1)}\,\frac{(k+2)^{(k+2)}}{(k+2)^{(2)}} \\
&+ (n-k)^{(2)}(N-k-3)^{(n-k-2)}\,\frac{(k+3)^{(k+3)}}{(k+3)^{(3)}} + \cdots
\end{aligned}$$

If we transpose -C, then

$$\frac{(k+j)!}{(k+j)^{(j)}} = k!$$

is a factor of the right-hand side, so that

$$C + (N-k)^{(n-k)}\,k! = k!\,\left\{(N-k-1)^{(n-k)} + (n-k)(N-k-2)^{(n-k-1)} + (n-k)^{(2)}(N-k-3)^{(n-k-2)} + \cdots\right\}.$$

If we substitute x = k+1 into Equation (28), we obtain exactly the succession of terms exhibited within the braces above. This allows us to write symbolically

$$\begin{aligned}
C + (N-k)^{(n-k)}\,k! &= k!\,\left\{1 - \Delta + \Delta^2 - \Delta^3 + \cdots + (-1)^{n-k}\Delta^{n-k}\right\}u_{k+1} \\
&= k!\;\frac{1 + (-1)^{n-k}\Delta^{n-k+1}}{1 + \Delta}\;u_{k+1} = k!\;E^{-1}u_{k+1} = k!\;u_k,
\end{aligned}$$

since, by Equation (28), $\Delta^{n-k+1}u_x = 0$. Hence

$$C + (N-k)^{(n-k)}\,k! = k!\;u_k = k!\,(N-k)^{(n-k)};$$

i.e., C = 0. Q.E.D.

APPENDIX B


AN ALTERNATE EXPANSION OF EQUATION (25)


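An alternate expansion of Equation (25) follows. As before, let

$$u_x = (N-x)^{(n-k)}, \qquad \phi(x) = x^{(k)}.$$

Then,

$$C + \Sigma u_x\phi(x) = \Delta^{-1}u_x\phi(x) = (EE' - 1)^{-1}u_x\phi(x),$$

where, temporarily, E operates on u alone, E' on φ alone. Continuing in a different manner,

$$\begin{aligned}
(EE' - 1)^{-1}u_x\phi(x) &= \left[E(1+\Delta') - 1\right]^{-1}u_x\phi(x) = (\Delta + E\Delta')^{-1}u_x\phi(x) \\
&= \frac{1}{E\Delta'}\left(1 + \frac{\Delta}{E\Delta'}\right)^{-1}u_x\phi(x) \\
&= \frac{1}{E\Delta'}\left[1 - \frac{\Delta}{E\Delta'} + \left(\frac{\Delta}{E\Delta'}\right)^{2} - \cdots\right]u_x\phi(x). \qquad \text{(B-1)}
\end{aligned}$$

From Equation (B-1), we can write the desired expansion, once more dropping the primes as no longer necessary.

$$\Sigma u_x\phi(x) = -C + u_{x-1}\Sigma\phi(x) - \Delta u_{x-2}\Sigma^2\phi(x) + \Delta^2 u_{x-3}\Sigma^3\phi(x) - \cdots + (-1)^j\Delta^j u_{x-j-1}\,\Sigma^{j+1}\phi(x) + \cdots \qquad \text{(B-2)}$$

Again it is useful to list a breakdown of terms.

$$\begin{aligned}
u_{x-1} &= (N-x+1)^{(n-k)} \\
-\Delta u_{x-2} &= (n-k)(N-x+1)^{(n-k-1)} \\
\Delta^2 u_{x-3} &= (n-k)^{(2)}(N-x+1)^{(n-k-2)} \\
&\;\;\vdots \\
(-1)^{n-k}\Delta^{n-k}u_{x-n+k-1} &= (n-k)! \\
\Delta^{n-k+1}u_{x-n+k-2} &= 0
\end{aligned} \qquad \text{(B-3)}$$

$$\begin{aligned}
\phi(x) &= x^{(k)} \\
\Sigma\phi(x) &= \frac{x^{(k+1)}}{k+1} \\
\Sigma^2\phi(x) &= \frac{x^{(k+2)}}{(k+2)^{(2)}} \\
\Sigma^3\phi(x) &= \frac{x^{(k+3)}}{(k+3)^{(3)}} \\
&\;\;\vdots \\
\Sigma^{j+1}\phi(x) &= \frac{x^{(k+j+1)}}{(k+j+1)^{(j+1)}} \\
&\;\;\vdots \\
\Sigma^{n-k+1}\phi(x) &= \frac{x^{(n+1)}}{(n+1)^{(n-k+1)}}
\end{aligned} \qquad \text{(B-4)}$$

Following the method of Appendix A,

$$(N-k)^{(n-k)}\,k! = \Sigma u_x\phi(k+1) = -C + (N-k)^{(n-k)}\,\frac{(k+1)^{(k+1)}}{k+1} - 0 + 0 - \cdots \qquad \text{(B-5)}$$

since all $\Sigma^{j+1}\phi(x)$ vanish when k + j ≥ x.

Simplifying,

$$(N-k)^{(n-k)}\,k! = \Sigma u_x\phi(k+1) = -C + (N-k)^{(n-k)}\,k!$$

and again C = 0. Thus, Equation (B-2) can be written

$$\Sigma u_x\phi(x) = u_{x-1}\Sigma\phi(x) - \Delta u_{x-2}\Sigma^2\phi(x) + \cdots, \qquad \text{(B-6)}$$

which holds for all admissible values of x. From Equations (B-3) and (B-4), it is apparent that when substituted into Equation (39), the expansion never will contain more than n - k + 1 terms, and will contain fewer whenever m < n.

Other expansions are possible, but usually prove to be more cumbersome than the two already developed.

APPENDIX C


PROGRAM PLANNING - BINOMIAL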


1. INTRODUCTION. Reliability is expressed by z or by r, depending upon whether or not it is a limit of integration.

In general, loops will be exited by comparing the difference between two successive iterations with some standard, δ. (See Paragraph 5E.)

Nearly every formula of interest is greatly simplified if expressed as a function of f(r). Thus,

$$p(k) = \frac{f(r)}{n+1}$$

$$f'(r) = \left[\frac{n-k}{r} - \frac{k}{1-r}\right] f(r) = \frac{1}{r}\left[n - \frac{k}{1-r}\right] f(r)$$

$$\int_{r=0}^{z} f(r)\,dr = T_1 + T_2 + T_3 + \cdots$$

where

$$T_1 = \frac{z\,f(z)}{n+1-k}$$

and

$$T_{i+1} = \frac{h_i}{n+2-h_i}\cdot\frac{z}{1-z}\cdot T_i, \qquad h_1 = k,\; h_2 = k-1,\; h_3 = k-2,\; \text{etc.}$$

Note that if n and k do not change, there is no need to compute

$$\ln\frac{(n+1)!}{(n-k)!\;k!}$$

more than once.


2. COMPUTING L (z specified). Equations (12) and (13).

    Enter data
    k = 0?    yes: sub-routine for zero failures
              no: continue
    Compute ln[(n+1)! / ((n-k)! k!)]
    Compute (n-k) ln z + k ln(1-z)
    Add the above, yielding ln f(z)
    Compute and store T_1
    Set h_1 = k
    Flag
    Compute T_{i+1} and add to partial sum
    Decrement h
    h = 0?    no: return to flag
              yes: continue
    Subtract final sum from 1
    End.
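The plan above can be rendered in a few lines. This is a sketch, not the original calculator program: the function names are assumptions, and `math.lgamma` stands in for whatever log-factorial routine (for example, Stirling's formula of Paragraph 5B) a given machine provides. The terms follow the recurrence T_1 = z f(z)/(n+1-k), T_{i+1} = [h/(n+2-h)] [z/(1-z)] T_i with h running from k down to 1.

```python
from math import lgamma, log, exp

def integral(x, fx, n, k):
    """Area under f from 0 to x as T_1 + T_2 + ... (k+1 terms)."""
    term = x * fx / (n + 1 - k)                   # T_1
    total = term
    h = k                                         # h_1 = k
    while h > 0:
        term *= h / (n + 2 - h) * x / (1 - x)     # T_{i+1} from T_i
        total += term
        h -= 1                                    # decrement h, exit at zero
    return total

def confidence(z, n, k):
    """Level of confidence L that r > z, for 0 < z < 1."""
    if k == 0:
        return 1.0 - z ** (n + 1)                 # zero-failure sub-routine
    ln_const = lgamma(n + 2) - lgamma(n - k + 1) - lgamma(k + 1)
    ln_f = ln_const + (n - k) * log(z) + k * log(1 - z)   # ln f(z)
    return 1.0 - integral(z, exp(ln_f), n, k)

print(confidence(0.567, 7, 2))
```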
3. COMPUTING z (L specified). Equations (12), (13), (44), and (45).

    Enter data
    k = 0?    yes: sub-routine for zero failures
              no: continue
    Compute and store ln[(n+1)! / (k! (n-k)!)]
    Compute and store r̂ = 1 - k/n = z_0, first estimate of z.
    Flag 1
    Compute (n-k) ln z_j + k ln(1-z_j)
    Add ln[(n+1)! / (k! (n-k)!)], yielding ln f(z_j)
    Compute and store T_1
    Set h_1 = k
    Flag 2
    Compute T_{i+1} and add to partial sum
    Decrement h
    h = 0?    no: return to Flag 2
              yes: continue
    Store.
    z_{j+1} = z_j + [F(z) - F(z_j)] / f(z_j),  where F(z) = 1 - L
    Test for exit (see Paragraph 5E); otherwise return to Flag 1

4. COMPUTING "BEST ESTIMATE OF THE RELIABILITY" (L specified). It might be said that the method employed is (Newton)². Therefore, it is mandatory that the program include realistic exit routines, in order to keep computer time within reason. Both $\tilde{z}_1$ and $\tilde{z}_2$ are computed. Equations (14), (15), (21), and Paragraphs 2 and 3 above are employed.

    Enter data
    k = 0?    yes: generate error message
    Compute and store ln[(n+1)! / (k! (n-k)!)]
    k = 1?    yes:
    Compute and store:
        r_pi = 1 - 2k/n

a 1 - g (1  - L) 


    no, k > 1:
    Compute and store:
        r_pi = r̂ ± sqrt( r̂ (1 - r̂) / (n - 1) )
        b_0 = r_pi2

The b_i are the successive estimates of $\tilde{z}_2$. The program will not run with b_0 = 1, hence the above split. To continue:

    Flag 1
    Compute (n-k) ln b_i + k ln(1-b_i)
    Compute and store f(b_i)
    To compute the integral of f(r) dr from 0 to b_i, call "integral" subroutine.
    Store.
    Set a_0 = r_pi1
    Flag 2
    Compute (n-k) ln a_j + k ln(1-a_j)
    Compute f(a_j)
    Compute f'(a_j)
    a_{j+1} = a_j + [f(b_i) - f(a_j)] / f'(a_j)
    Exit test satisfied?    no: return to Flag 2
                            yes: continue
    To compute the integral of f(r) dr from 0 to a, call "integral" subroutine


}. 

•I 

i 


£ 

» 


NOTE: Since at this point in the solution f(a) = f(b), the second fractional expression reduces to


"”i:a 

‘ “ I 1^^ 


K \ 

i=e) 


+ 


end. 


Subroutine "integral"

    From x and f(x), compute and store
        T_1 = (n+1-k)^(-1) x f(x)
    Set h_1 = k
    Flag 3
    Compute and add to partial sum
        T_{i+1} = h_i x T_i / [(n+2-h_i)(1-x)]
    Decrement h_i

APPENDIX D


PROGRAM PLANNING - HYPERGEOMETRIC


I.  INTRODUCTION.  The  variable  x is  to  be'  assooiated  with  the  probability 
of  a ‘specIHcniiE^ of  defectives.  The  variable  m is  to  be  associated  with 
the  ouniulative  probability  that  FB^/ER  than  the  stated  number  of  defeotives  exist. 

In  general I the  series  to  be  summed  are  all  finite » but  when  both  m and  n 
are  quite  large,  it  will  measureably  hasten  exiting  the  loop  to  oonpare  the  term 
with  sans  arbitrarily  small  standard,  r«ther  than  zero. 


The formulae of interest are conveniently expressed as functions of
p_x(N,x,n,k). Thus,

    p_x = \frac{C(x,k)\,C(N-x,\,n-k)}{C(N+1,\,n+1)}                          (D-1)

    [equation illegible in source]                                           (D-2)

    \sum_{x=k}^{m-1} p_x = T_1 + T_2 + T_3 + \cdots                          (D-3)

where

    T_1 = \frac{C(m,k+1)\,C(N-m+1,\,n-k)}{C(N+1,\,n+1)}
        = \frac{m-k}{k+1} \cdot \frac{N-m+1}{N-m-n+k+1} \cdot p_m            (D-4)

and

    T_{i+1} = \frac{m-(k+i)}{k+i+1} \cdot \frac{(n+1)-(k+i)}{N-m-n+k+i+1} \cdot T_i      (D-5)


Note that in implementing the above formula (D-5), the factors of the
numerator should be computed before incrementing the index, the factors of
the denominator after. In fact, under this scheme, the denominator of
Equation (D-4) becomes equivalent to that of Equation (D-5), and the index
can be set to k+1 before computing T_1.

2. COMPUTING p_x (x ARBITRARY). Equations (38) and (43)

Although the computation of nine logarithms is involved, there should
be no difficulty encountered worthy of notice. It is preferable to compute
and add (±) the largest logarithms last. ((N+1)!, (N-n)!, and (N-x)! will be
the largest numbers.)


3. COMPUTING \sum_{x=k}^{m-1} p_x (m SPECIFIED). Equations (39) and (B-6)

The program plan is left to the reader. Sufficient suggestions should
be found in Paragraph 2, Appendix C, and in Paragraphs 1 and 2, above.
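
One possible program plan for Paragraph 3 is sketched below in Python. It is a minimal illustration, not the author's program: the function names and the `tol` argument are assumptions, p_x is taken as C(x,k)C(N-x,n-k)/C(N+1,n+1), and the loop follows the advice of Paragraph 1 (numerator factors before incrementing the index, denominator factors after; exit when the term falls to the chosen small standard).

```python
from math import comb


def p_x(N, x, n, k):
    """Equation (D-1): probability attached to exactly x defectives."""
    return comb(x, k) * comb(N - x, n - k) / comb(N + 1, n + 1)


def cumulative_by_series(N, m, n, k, tol=0.0):
    """Sum of p_x for x = k .. m-1, built from T_1, T_2, ... (D-3 to D-5)."""
    if not (k + 1 <= m <= N - n + k + 1):
        raise ValueError("m must satisfy k+1 <= m <= N-n+k+1")
    # T_1 from Equation (D-4), written with binomial coefficients.
    term = comb(m, k + 1) * comb(N - m + 1, n - k) / comb(N + 1, n + 1)
    total = 0.0
    j = k + 1                      # index already set to k+1 for T_1
    while term > tol:
        total += term
        # Numerator factors use the index before it is incremented ...
        numerator = (m - j) * ((n + 1) - j)
        j += 1
        # ... denominator factors use the index after.
        denominator = j * (N - m - n + j)
        term *= numerator / denominator
    return total
```

Passing a small positive `tol` instead of zero corresponds to the "arbitrarily small standard" mentioned in Paragraph 1; with `tol=0.0` the loop stops when a term becomes exactly zero.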

4. COMPUTING m (L SPECIFIED). See Paragraph 4, "Hypergeometric Probability."
The problem is to find an integral value of m such that

    \sum_{x=k}^{m-2} p_x < L < \sum_{x=k}^{m-1} p_x                          (D-6)

A first approximation is obtained by using the method of Appendix C,
Paragraph 3 to solve for z from the observed values of n and k, then applying
the transformation

    m = k - [term illegible] + (1-z)(N-n+1) + \epsilon                       (D-7)

The quantity 0 ≤ ε < 1 is necessary to insure that m is an integer. A study of
Figure 6 will reveal why Equation (D-7) is a suitable transformation.

In actual practice, ε need not even be determined. Instead, the estimate
of m from Equation (D-7) is truncated at the decimal point, yielding m-1.

Next \sum_{x=k}^{m-2} p_x is computed from the estimate of m-1 (see Paragraph 3, above).

It is not necessary to compute the second integral of Equation (D-6), since

    \sum_{x=k}^{m-1} p_x = p_{m-1} + \sum_{x=k}^{m-2} p_x                    (D-8)

and both members of the right-hand side already are available.

If the inequality (D-6) holds, the problem is solved. If not, the estimate
of m-1 is adjusted by unity and the last process repeated. (Only p_m or p_{m-2}, as
the case may be, need be computed.)

5. COMPUTING THE "BEST ESTIMATE OF x, THE NUMBER OF DEFECTIVES."

The problem is similar to that discussed in Paragraph 4, above.

First approximations to R_1 and R_2 are obtained by using the method of
Appendix C, Paragraph 4, to compute [symbols illegible] and then transforming the variables.

Several values of both the simple and cumulative probabilities are computed
for arguments near the estimates of R_1 and R_2. The results are tabulated and
inspected. The simple rectangles are discarded one at a time, beginning with
the smallest in area. The process stops when one more step would reduce the
remaining integral (area) to less than the value of L.

The "maximum likelihood estimate" is merely the value of x associated
with the tallest rectangle.
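
The search for m in Paragraph 4 can be illustrated with a short Python sketch. It is not the author's procedure: the first approximation from Appendix C is replaced here by a plain scan upward from m = k+1, the upper inequality of (D-6) is taken as non-strict so that a solution always exists, and the name `find_m` is hypothetical. The running sum is updated with Equation (D-8), so only one new p_x is computed per step.

```python
from math import comb


def p_x(N, x, n, k):
    """Equation (D-1), with denominator C(N+1, n+1)."""
    return comb(x, k) * comb(N - x, n - k) / comb(N + 1, n + 1)


def find_m(N, n, k, L):
    """Smallest integer m whose cumulative sums bracket L as in (D-6)."""
    if not 0.0 < L < 1.0:
        raise ValueError("L must lie strictly between 0 and 1")
    lower = 0.0                       # sum over x = k .. m-2 (empty at m = k+1)
    last_m = N - n + k + 1            # largest admissible m
    for m in range(k + 1, last_m + 1):
        upper = lower + p_x(N, m - 1, n, k)   # Equation (D-8)
        if lower < L <= upper:        # assumption: upper bound non-strict
            return m
        lower = upper
    return last_m                     # reached only through rounding error
```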


ESTIMATION AND PREDICTION OF CONFIDENCED
RELIABLE LIFE FROM SMALL SAMPLE SIZES

Eugene E. Coppola
Benet Weapons Laboratory
Watervliet Arsenal
Watervliet, New York

RELIABLE LIFE AND ITS LOWER CONFIDENCE BOUND. Reliable life is that
time S during which a specified proportion R of a population of devices
will operate continuously without failure. The proportion R is called
the reliability. Reliable life is important for devices that fail
catastrophically, that is, failure of the devices generally results in
destruction of the devices and possibly destruction of surrounding equipment
and injury or death to operating personnel. Cannon components such as
tubes, breeches and chambers fall into this category.

Since such catastrophic failures must be avoided, it is important
that the device be operated only during the time when the probability of
failure is low. The reliable life for a new device, however, is not
known and hence it must be estimated from test data. In addition, for
gun components, a confidence requirement is added. Namely, it must be
shown with a specified confidence level C that the actual reliable life
exceeds a given value. In other words, what we want is a lower
confidence bound S̃ at confidence level C for the reliable life S. The lower
confidence bound will be called lower confidenced reliable life (LCRL).
Note that when applied to cannon components, reliable life is usually
called safe life.

The testing of cannon components is quite costly and time-consuming.
Consequently economic and time considerations greatly limit the number
of components that can be tested. This number, the sample size, is
generally around six, although in some instances it has been as low as three
and as high as 20. If the reliability were low, then this restriction of
sample size would be relatively unimportant. However, the reliability for
cannon components is generally specified to be 99.9 per cent. Further,
the confidence C is generally specified to be 90 per cent. On first glance,
one might imagine that the smallness of the sample size would give highly
undesirable results in calculating S. This is, however, not always the
case, as we shall see below.

THE LOGNORMAL DISTRIBUTION. Because of the smallness of the sample
size, non-parametric methods do not give satisfactory results.
Consequently, it is necessary to assume that the failure times follow a
distribution of known mathematical form. The lognormal and Weibull families
of distributions are usually used for this purpose. In this paper, we
will restrict ourselves to the lognormal family.

A real-valued random variable X is said to follow a lognormal
distribution if X is positive with probability 1 and log X follows a normal
distribution. The normal distribution of log X will depend on the two usual
parameters, μ and σ, defined by μ = E(log X) and σ² = Var(log X). These
two parameters are also the parameters of the lognormal distribution of X.

In terms of the parameters, the reliable life S is given by:

    S = \exp(\mu - \sigma z_R)

where z_R is the 100 R'th per cent point of the standard normal distribution.

Assume that m specimens have been tested to failure, with the failure
times being x_1, ..., x_m. We further assume that the specimens were randomly
selected from the population and that they are independent. Then the
maximum likelihood estimates (MLE's) of μ, σ and S are given by:

    \hat{\mu} = \frac{1}{m} \sum_{j=1}^{m} \log x_j

    \hat{\sigma}^2 = \frac{1}{m} \sum_{j=1}^{m} (\log x_j - \hat{\mu})^2

    \hat{S} = \exp(\hat{\mu} - \hat{\sigma} z_R)

The LCRL S̃_m is given by:

    \tilde{S}_m = \exp(\hat{\mu} - \hat{\sigma} K_m)

where K_m is a tolerance factor that depends on m, R and C. Values of K_m
for various m, R and C have been tabulated and are readily available in the
statistical literature. Note that since we are most interested in
examining the sensitivity of the LCRL to the sample size m, we have added a
subscript m to the LCRL notation S̃_m to emphasize that S̃_m is being calculated
from a sample of size m.

STATISTICAL PROPERTIES OF S̃_m. To eliminate the parameter μ, we
consider S̃_m/S rather than S̃_m. The distribution of S̃_m/S, in fact, does not
depend on the parameter μ; it does, however, depend on m, R and C and the
parameter σ. Now, σ is generally not known. However, from past experience
it appears that for cannon tubes and breeches, σ will be between 0 and 0.3
in the vast majority of cases, with an average value of about 0.2. The
expected values of S̃_m/S and (S̃_m/S)² are given by:

    [The two integral expressions for E(S̃_m/S) and E(S̃_m/S)² are illegible
    in the source. They involve a constant defined as (2m)^{-1/2}.]


By evaluating these, the mean and variance of S̃_m/S may be determined.

Table 1 shows the expected value of S̃_m/S for R = 0.999, C = 0.9 and
for various σ and m. Since we would like S̃_m to be close to S, values of
E(S̃_m/S) close to 1 are most desirable. However, since S̃_m ≤ S with
probability C, we should have E(S̃_m/S) < 1. As can be seen from Table 1,
E(S̃_m/S) is much smaller than 1 for very small m. For example, for σ =
0.2 and m = 3, E(S̃_m/S) is approximately 50 per cent. This means that
on the average, S̃_m will only be half as large as S. For the developer,
this means that if a policy were adopted that LCRL were to be based
exclusively on samples of 3, then the developer would have to insure that
on the average the actual reliable life of the equipment be twice as large
as the reliable life he desires to demonstrate. For this reason alone, a
policy of basing LCRL on samples of 3 is highly undesirable.

It should be no surprise that m = 3 gives undesirable results. What
is surprising is that for m not much larger than 3, the results are not
too bad. The author finds it remarkable, for example, that for m = 10,
one will obtain S̃_m on the average about 77 per cent of the actual S.

Table 2 shows the variance of S̃_m/S. A variance near 0 is most
desirable. As can be seen, the variances for very small m are relatively far
from 0. For σ = 0.2, the variance is fairly small for m ≥ 6 and changes
relatively little with increasing m. Use of any of the standard
inequalities such as Chebyshev's inequality shows that even for relatively small
m, S̃_m/S will tend to be fairly close to its expected value.

Hence, the restriction to small m, while not ideal from a statistical
viewpoint, is not especially damaging either, provided m is not too small.
In fact when σ is close to 0, the LCRL will have quite good properties.

THE EFFECT OF INCREASING SAMPLE SIZES. Although in some cases small
m may give acceptable results, in other cases small m may not be as
desirable. Let us investigate the following question: m specimens have been
tested. What would happen if we tested an additional k specimens and added
them to the sample to give a sample size of m + k?

The question is of more than academic interest. For example, Table 3
shows the probability that by adding one more specimen to the sample, we
can increase LCRL. As can be seen, it is likely that LCRL will increase.
If the amount of increase is sufficiently large, it may be worthwhile to
test one or two more specimens.

Assume that we have a sequence X_1, X_2, X_3, ... of independent, randomly
selected failure times. For each m, let

    \hat{\mu}_m = \frac{1}{m} \sum_{j=1}^{m} \log X_j

    \hat{\sigma}_m^2 = \frac{1}{m} \sum_{j=1}^{m} (\log X_j - \hat{\mu}_m)^2

    \tilde{S}_m = \exp(\hat{\mu}_m - \hat{\sigma}_m K_m)

These are just the MLE's and LCRL that we would calculate from the
first m failure times. To see the effect of adding k additional specimens
to a sample of m, we want to study S̃_{m+k} in relation to S̃_m.

Define

    T_{m,k} = \frac{\log \tilde{S}_{m+k} - \log \tilde{S}_m}{\hat{\sigma}_m}


The distribution of T_{m,k} does not depend on either μ or σ, so that T_{m,k}
can be used whatever the actual values of these parameters. A knowledge
of the distribution of T_{m,k} is useful for the following reason: Once m
specimens have been tested, we can calculate S̃_m and σ̂_m. The only unknown
quantity in the definition of T_{m,k} is S̃_{m+k}. Consequently, probability
statements concerning T_{m,k} can be translated into probability statements
concerning S̃_{m+k}. In particular, we can construct prediction intervals for
S̃_{m+k} in terms of S̃_m and σ̂_m, as follows: Assume p (0 < p ≤ 1) is given and
that we have determined two numbers t_1 and t_2 such that

    \Pr(t_1 \le T_{m,k} \le t_2) = p

This last equation is equivalent to:

    \Pr(\tilde{S}_m \exp(\hat{\sigma}_m t_1) \le \tilde{S}_{m+k} \le \tilde{S}_m \exp(\hat{\sigma}_m t_2)) = p

Consequently, (S̃_m exp(σ̂_m t_1), S̃_m exp(σ̂_m t_2)) will be a prediction interval
for S̃_{m+k} at level p.

An interesting fact about T_{m,k} is that with probability 1, T_{m,k} is
bounded from above under a certain mild condition. In fact, T_{m,k} can be written
in the following form:

    [Equation (1) and the definitions of its constants are illegible in the source.]

where z is standard normal, x is chi-squared with k-1 degrees of freedom,
y is chi-squared with m-1 degrees of freedom and x, y, and z are
independent. (Note: we assume here that m ≥ 2. We allow k ≥ 1 and interpret
a chi-square variate with 0 degrees of freedom as a random variable which
takes the value 0 with probability 1). The function given in (1) above will
take a maximum value if [condition on the constants illegible], and in this
case, the maximum value is

    [expression illegible]

The condition is equivalent to:

    [inequality illegible]                                                   (2)

When this inequality is satisfied, the maximum value of T_{m,k} will be

    [expression illegible]

Consequently, when inequality (2) is satisfied, T_{m,k} is no larger than this
value with probability 1. This is equivalent to: S̃_{m+k} ≤ [bound illegible]
with probability 1. Inequality (2) will
not be true for all m and k. (In fact, as k → ∞, the left
side of (2) approaches [limit illegible] while the right side approaches ∞. Hence,
inequality (2) will not be true for large k). However, (2) will be true for
m and k of interest in this paper. For example, when m = 6, R = 0.999 and
C = 0.9, inequality (2) will be true for k ≤ 50.

The distribution of T_{m,k} for R = 0.999, C = 0.9 and for some m and
k has been determined by Monte-Carlo simulation. The cumulative
distributions of T_{m,k} for m = 6 and k = 1, 2, 3 and for m = 10 and k = 1, 2,
3 are shown. Note that for m = 10, the bulk of the distribution is
concentrated near 0, so that adding up to 3 more samples to an
already-existing sample of 10 will probably not produce much change in LCRL. For
m = 6, the distribution is not so closely concentrated near 0. However,
depending on the actual numbers involved, the prediction intervals may be
fairly tight.
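
A Monte-Carlo study of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: T_{m,k} is taken as (log S̃_{m+k} - log S̃_m)/σ̂_m, the tolerance factors `K_m` and `K_mk` must be supplied by the user from published tables (none are built in), the equal-tail interval is one choice among several, and all function names are hypothetical.

```python
import math
import random


def mle(logs):
    """MLE's of mu and sigma (divisor m) from a list of log failure times."""
    m = len(logs)
    mu = sum(logs) / m
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / m)
    return mu, sigma


def simulate_t(m, k, K_m, K_mk, mu=0.0, sigma=1.0, reps=2000, seed=1):
    """Sorted simulated values of T_{m,k} for user-supplied tolerance factors."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        logs = [rng.gauss(mu, sigma) for _ in range(m + k)]
        mu_m, s_m = mle(logs[:m])          # from the first m specimens
        mu_mk, s_mk = mle(logs)            # from all m + k specimens
        log_lcrl_m = mu_m - s_m * K_m
        log_lcrl_mk = mu_mk - s_mk * K_mk
        values.append((log_lcrl_mk - log_lcrl_m) / s_m)
    values.sort()
    return values


def prediction_interval(lcrl_m, sigma_m, sorted_t, p):
    """Equal-tail prediction interval for the LCRL after k more specimens."""
    tail = (1.0 - p) / 2.0
    n = len(sorted_t)
    t1 = sorted_t[int(tail * n)]
    t2 = sorted_t[min(n - 1, int((1.0 - tail) * n))]
    return lcrl_m * math.exp(sigma_m * t1), lcrl_m * math.exp(sigma_m * t2)
```

Running `simulate_t` twice with the same seed but different `mu` and `sigma` gives the same values up to rounding, which is the numerical counterpart of the statement that the distribution of T_{m,k} depends on neither parameter.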

An Example

As an example, consider the following six failure times: 2596, 2536, 2811,
2141, 2416, 2839. We calculate from these:

    μ̂_6 = 7.841
    σ̂_6 = 0.09493
    S̃_6 = 1427, for R = 0.999, C = 0.9
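
The calculation of the MLE's and the LCRL can be sketched in Python as below. The function names are illustrative, and the tolerance factor `K` is an input that has to be looked up in published tables for the chosen m, R and C; no value of K is assumed here.

```python
import math


def lognormal_mle(times):
    """MLE's of mu and sigma (divisor m) from positive failure times."""
    logs = [math.log(t) for t in times]
    m = len(logs)
    mu = sum(logs) / m
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / m)
    return mu, sigma


def lcrl(times, K):
    """Lower confidenced reliable life exp(mu_hat - sigma_hat * K)."""
    mu, sigma = lognormal_mle(times)
    return math.exp(mu - sigma * K)


failure_times = [2596, 2536, 2811, 2141, 2416, 2839]
mu_hat, sigma_hat = lognormal_mle(failure_times)
```

Because the LCRL is exp(μ̂ - σ̂ K), a larger tolerance factor (higher R or C, or smaller m) always gives a smaller LCRL for the same data.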

Now suppose that the original test plan is to test 8 specimens, of which
the first 6 gave the failure times above. Then prediction intervals for
S̃_7 and S̃_8 at a level of 90% are

    S̃_7:  1305, 1554
    S̃_8:  1265, 1649

The figures on the right represent the upper bounds mentioned above. That
is, with 100% confidence, S̃_8 ≤ 1649.

Now, if the original aim of the test was to demonstrate a reliable
life of 2000, it is clear that this will be impossible. For after all 8
specimens have been tested, the LCRL cannot be higher than 1649.
Consequently, the testing can be halted after 6.

Suppose instead that only 1500 reliable life was desired. From the
distribution of T_{6,2}, one can calculate that 1500 is a lower prediction
bound of S̃_8 at a level of approximately 70 per cent. One can therefore
be fairly confident the test will show at least 1500 reliable life. On
the other hand, if only 1250 reliable life were desired, then one can be
about 90% confident that the final results will show a reliable life of
at least 1250.


SEQUENTIAL ALLOCATION OF OBSERVATIONS IN THE
EXPONENTIAL SELECTION PROBLEM

Robert M. Wharton, Ph.D.              H. Srinivasan, Ph.D.
Thomas Jefferson University           Temple University
Philadelphia, Pennsylvania            Philadelphia, Pennsylvania


ABSTRACT. Two sequential data-dependent allocation rules for assigning
patients to clinical trials are explored in this paper. The objective of the
designs is to test the null hypothesis that there is no difference in mean
survival times associated with two treatments where survival time is assumed
to be exponentially distributed and at the same time to minimize the number
of patients assigned to the inferior treatment. Both the single patient and
multiple patient entry cases are discussed.

1. INTRODUCTION. This paper is concerned with protocols for clinical
trials in which we desire to determine whether there is any difference in the
effects of two treatments. We will base our decision on some measurable
response associated with a treatment (e.g. survival time, time to remission,
etc.). We also assume that patients arrive for treatments sequentially in
time either individually or in groups.

Most clinical trials addressing this question require approximately
equal numbers of patients to be assigned to each treatment. Now suppose it
becomes clear to the treating physician that one treatment is better than the
other before sufficient patients have been accrued to reach a decision with
the significance and power specified in the original trial design. He then
faces an ethical problem. He can not continue to treat patients with an
inferior treatment and yet by terminating the trial prematurely, he may lose
information which would be invaluable in planning the treatment of many future
patients.

To reduce this ethical problem, it would be useful to design the clinical
trial using the data collected up to a given point to choose the treatment
for a patient entering the trial at that point, the aim being a design which
tends to assign the majority of the patients to the superior method of
treatment, while meeting the classical statistical criteria of significance and
power.

For the exponential selection problem in which we test the null hypothesis
that there is no difference in mean survival times associated with the two
treatments where survival times are assumed to be exponential with parameters
(death rates) depending upon treatment, Flehinger and Louis (1971) have
investigated a whole range of sequential data-dependent assignment rules ranging from
strict alternation of treatment to assignment of treatment with lower estimated
death rate. Clearly the most data-dependent allocation rule would be to assign
the next patient to the treatment with the smallest expected death rate
(maximum likelihood estimate). Unfortunately this rule would often have the
effect of allocating an overwhelming proportion of patients to one or the
other treatment and thus extending the length of the trial indefinitely
(Armitage 1975). To reduce this difficulty Flehinger and Louis have
proposed the following range of allocation rules:

Let D_in = the number of deaths of patients treated by
           method i by time n,

    T_in = the total time lived by patients treated by
           method i by time n,

    γ be a constant between 0 and 1;
then at time n

a) If |D_1n - D_2n| [comparison with γn illegible in source] and [condition
   illegible], treatment 1 is used, whereas if [condition illegible],
   treatment 2 is used.

b) If [condition illegible] and D_1n / T_1n [comparison illegible] D_2n / T_2n,
   treatment 1 is used, whereas if D_1n / T_1n > D_2n / T_2n, treatment
   2 is used.


2. ALLOCATION RULES. We wish to examine two further allocation rules.

The first treats the same situation as the Flehinger-Louis rules (i.e. exponential
survival time, patients arriving sequentially over a period of time and being
assigned immediately to a single treatment). This allocation rule, which we will
refer to as R1, assigns treatment to the next three incoming patients based on
accumulated data, with two of the patients receiving treatment 1 if it has the
smallest expected death rate and one patient then receiving treatment 2, and
the reverse if treatment 2 is associated with the lowest expected death rate.

If the two treatments have the same expected death rate then the treatment
given to two of the next three patients is reversed from that of the previous
triple of patients. This relatively simple rule overcomes the difficulty of
an overwhelming proportion of patients going to any one treatment and is
comparable to the Flehinger-Louis rules with respect to Average Sample Number
(A.S.N.) and Inferior Treatment Number (I.T.N.).

It also has the advantage that it can be extended in a natural way to the
case of multiple patient entry. We will only consider here the situation where
three patients arrive for treatment every third day but the suggested approach
can easily be extended to a more general setting. In the three patient entry
case, as in the above allocation rule, we assign two of the next three patients
to the treatment which has the smallest expected death rate on the basis of
accumulated data. We shall refer to this allocation rule as R3. The
protocols for clinical trials utilizing R1 and R3 are open sequential
designs which terminate when the likelihood ratio crosses a given boundary.
This is also true of the Flehinger-Louis rules. Comparisons of R1 and
R3 with two allocation rules denoted R2 and R4 involving strict
alternation are presented in Section 3. These results have been obtained
on the basis of computer simulation with 1000 replications for each entry.

3. DEFINITIONS AND SIMULATION RESULTS.

Hypothesis under consideration:

It is assumed that there are two treatments available. A patient is
given one of these treatments at a point in time, after which his remaining
life length has an exponential distribution; the death rates λ_1 and λ_2
depend upon the treatments. The clinical trial is intended to choose one
of the following hypotheses:

    H_0: λ_1 = λ_2,   H_1 and H_2: the two alternatives in which one death
    rate is k times the other [exact statements illegible in source],

where k > 1 is chosen in advance as a ratio which represents a medically
significant difference.


Allocation Rules:

For any given time t, after the trial begins, let

    S_ijt be the time lived since treatment, if he is still
          alive, for a patient given treatment i at time j,

    Y_ijt be the time lived from treatment to death, if he has died,

    D_it  be the number of deaths of patients treated by method i by time t,

    T_it  be the total time lived by patients treated by method i by time t.

We consider four allocation rules denoted R1 thru R4.

R1  The patients arrive one per day but the treatment plan for the next three
    (3) days is defined every third day by randomly assigning one of the two
    treatments to two of the patients and the other treatment to the remaining
    patient in the triple. Which treatment is used twice is determined by the
    following rule:

        if D_1t / T_1t < D_2t / T_2t, treatment 1 is used;

        if D_1t / T_1t > D_2t / T_2t, treatment 2 is used;

        if D_1t / T_1t = D_2t / T_2t, reverse previous assignment.

R2  is simple alternation of treatment one and two, with the
    treatment for the first patient randomly selected.

R3  Three patients arrive on the first day and every third day thereafter.
    The treatment received by the majority of the next triplet is determined
    by the same rule used in R1.

R4  The patients arrive as in R3. Treatment one or two is randomly selected
    and this treatment is randomly assigned to two of the 3 patients with
    the third patient receiving the other treatment. The treatment scheme
    is reversed for the next 3 patients.
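
The majority rule shared by R1 and R3 can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the estimated death rate is taken to be deaths divided by total time lived, and the comparison is done by cross-multiplication so that the rule is still defined before any time has accumulated (an assumption, since the source does not say how the first triple is handled).

```python
import random


def majority_treatment(d1, t1, d2, t2, previous):
    """Treatment (1 or 2) given to two of the next three patients.

    d1, d2: deaths so far on treatments 1 and 2
    t1, t2: total time lived so far on treatments 1 and 2
    previous: majority treatment of the preceding triple
    """
    # d1/t1 < d2/t2 is equivalent to d1*t2 < d2*t1 for positive times.
    left, right = d1 * t2, d2 * t1
    if left < right:
        return 1
    if left > right:
        return 2
    return 3 - previous            # tie: reverse the previous assignment


def assign_triple(majority, rng):
    """Randomly order the triple so two patients receive the majority treatment."""
    triple = [majority, majority, 3 - majority]
    rng.shuffle(triple)
    return triple
```

For example, `assign_triple(majority_treatment(2, 100.0, 5, 100.0, previous=2), random.Random(0))` returns a list containing treatment 1 twice and treatment 2 once.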

Termination:

The termination rules considered utilize the likelihood ratios L_1t and L_2t

    [The formula is illegible in the source. The legible fragments involve
    the ratio (T_1t + T_2t) / [denominator illegible] raised to the
    power (D_1t + D_2t).]

and are of the form: select two numbers A and B with A < 1 < B.

    max(L_1t, L_2t) < A      terminate and accept H_0.

    max(L_1t, L_2t) > B      terminate and accept H_i, where i corresponds
                             to the larger of L_1t, L_2t.

    A < max(L_1t, L_2t) < B  continue testing.

Flehinger and Louis (1971) showed that for k = 2, A = .1 and B = 30
give a significance level of .03 and a power of .95. These values were
used in the results that follow.
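
The stopping decision can be written as a small Python function. The sketch takes the two likelihood ratios as given (their formula is not reproduced here) and uses the quoted boundaries A = .1 and B = 30 as defaults; the function name and the treatment of exact equality with a boundary (counted as "continue") are assumptions.

```python
def termination_decision(l1, l2, a=0.1, b=30.0):
    """Apply the three-way stopping rule to likelihood ratios l1 and l2."""
    if not a < 1.0 < b:
        raise ValueError("boundaries must satisfy A < 1 < B")
    largest = max(l1, l2)
    if largest < a:
        return "accept H0"
    if largest > b:
        # Accept the alternative that corresponds to the larger ratio.
        return "accept H1" if l1 >= l2 else "accept H2"
    return "continue"
```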

The authors are currently working on more extensive computer simulations
of the schemes presented here for the exponential case and on similar results
for the normal case. The implications of introducing further randomization and
its effect on selection and trend bias are also being explored.


Maximum Likelihood Estimation of 12D
for Inoculated Packs

Abstract. This paper describes a statistical procedure for
estimating the 12D dose in the radiation-sterilization of
canned food, using data from an inoculated pack experiment.
The method assumes a two-parameter distribution, of which
the shifted-exponential is taken as a prototype, and uses
the maximum-likelihood principle to estimate the parameters
and hence 12D. The procedure is embodied in a computer
program which estimates 12D and provides confidence limits on
both 12D and the kill at zero dose. The method is
illustrated by an example.

1. Introduction. This paper is concerned with methods for
assessing the effectiveness of ionizing radiation as a means
of food-preservation. In particular, it deals with the
problem of estimating the 12D dose, using the data obtained from
an inoculated pack experiment. A number of papers have dealt
wholly or partly with this question, including those of
Anellis et al. (1968, 1969, 1975), Grecz et al. (1971) and
Ross (1974, 1976). The general problem is one of determining
a dose-response function and is discussed by Finney (1952).

The purpose of the present paper is to describe a method
of data analysis, based on the maximum-likelihood (ML)
criterion, for estimating 12D from inoculated pack data. An
example will be presented, showing how the method works.

The ML method is a very widely-used procedure for
deriving estimates of unknown parameters from experimental data
and is described in most books on mathematical statistics,
e.g., Hoel (1971). It seems not to have been applied in
analysing inoculated pack data, possibly because it leads to
complicated formulas that can in practice only be solved
with the aid of a high-speed computer. Despite this
drawback the ML method is worth considering because it can
extract more useful information than other procedures from
the same data.

2. Theory. This section is divided into two parts, dealing
with the inoculated pack experiment and a description of the
ML method.


2a. The Inoculated Pack Experiment. The inoculated pack
experiment consists of inoculating cans of the food
substrate with a large number of the test-microorganisms. The
cans are then vacuum-sealed, groups of them are exposed to
different doses of radiation and then incubated. After
incubation each can is examined to see whether it contains
survivors. In the example described later, the test
microorganisms were spores of ten strains of Clostridium botulinum, the
incubation period was six months and the method of examination
was the recovery of viable botulinal cells.

If we denote the different groups of cans by index i,
i = 1,2,..., M, we define

    x_i = dose which the i-th group received.

    n_i = number of organisms per can in the i-th group.

    N_i = number of cans in the i-th group.

    K_i = number of sterilized cans (i.e., cans without
          survivors) in the i-th group.

Usually the experiment is designed so that all n_i are
approximately equal, and all N_i are the same. This simplifies the
experiment and analysis, but there are advantages to be gained
by varying n_i and N_i. In any case, the procedure described
here applies to the situation where n_i and N_i may all be
different.

The data consist of x_i, n_i, N_i, and K_i for i = 1,2,...,M,
where x_i and n_i are non-negative numbers and N_i and K_i are
non-negative integers. The data-analysis must deduce an estimate
of the 12D dose from this data.

2b. The ML Method. The method described here is based on
the general probability theory for inoculated packs, see Ross
(1974).

It is assumed that under the test conditions, the
probability that an individual organism will be killed at
dose x is given by the distribution function G(x), the
survival probability being 1-G(x). The 12D-dose, which we denote
x_c, satisfies

    1 - G(x_c) = 1 \times 10^{-12}                                   (1)

The probability that a can containing n organisms will be
sterilized (i.e., all organisms will be killed) at dose x
is denoted by Φ(x). Φ(x) and G(x) are related by means of

    \Phi(x) = [G(x)]^{n}

or, approximately for n large and 1-G small,

    \Phi(x) = \exp\{-n[1 - G(x)]\}

    G(x) = 1 + n^{-1} \ln \Phi(x)

In the inoculated pack test at dose x_i, N_i cans are
exposed, each having Φ(x_i) = Φ_i as the probability of
sterilization. The probability that K_i cans are sterilized
at dose x_i is given by the binomial distribution

    P_i = C_i \, \Phi_i^{K_i} \, (1 - \Phi_i)^{N_i - K_i}


So far, nothing has been assumed about the form of the function $G(x)$. We now assume that $G(x)$ has a general form, $G(x; B_1, B_2)$, by which is meant that $G$ depends not only on $x$ (i.e., dose) but on two other quantities, $B_1$ and $B_2$, which are independent of dose. For example, assuming a shifted-exponential distribution for $G(x)$ means

$$G(x) = 1 - \exp\{-B_1(x - B_2)\}$$

If $B_1$ and $B_2$ were known, we could immediately estimate 12D by solving Equation (1), which becomes in this case

$$1 - G(x_0) = 1 \times 10^{-12} = \exp\{-B_1(x_0 - B_2)\}$$

or

$$x_0 = B_2 + (12 \ln 10)/B_1 = B_2 + 27.63/B_1$$

Usually we do not know $B_1$ and $B_2$, and our problem is then to estimate them from the inoculated pack data.
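The closed-form 12D dose for the shifted exponential is easy to evaluate once the two parameters are in hand. The sketch below assumes that form only; the function name and the parameter values in the usage line are hypothetical and are not taken from the paper's tables.

```python
import math

def twelve_d_dose_shifted_exponential(b1, b2, decades=12):
    """Dose at which single-organism survival equals 10**(-decades).

    Solves exp(-b1 * (x - b2)) = 10**(-decades) for x.
    """
    if b1 <= 0:
        raise ValueError("b1 must be positive")
    return b2 + decades * math.log(10.0) / b1

# Made-up parameter values, for illustration only.
x12 = twelve_d_dose_shifted_exponential(b1=7.0, b2=0.1)
```

With `b1 = 1` and `b2 = 0` the function returns 12 ln 10, the constant 27.63 that appears in the formula above.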

The ML method tells us to do this by choosing $B_1$ and $B_2$ so that the probability (likelihood) of getting the observed experimental results is as large as possible. If $B_1$ and $B_2$ are given, then $\Phi_i$ is known and $P_i$ is the likelihood of getting the observed outcome at dose $x_i$. Since the cans at different doses are tested independently, the joint likelihood of getting the observed experimental results for all the $M$ doses is

$$P = P_1 P_2 \cdots P_M$$


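The ML procedure directs us to find $B_1$ and $B_2$ so that $P$ will be maximized. This is equivalent to maximizing

$$\Gamma = \ln P = \sum_{i=1}^{M} \{\ln C_i + K_i \ln \Phi_i + (N_i - K_i) \ln(1 - \Phi_i)\} \qquad (6)$$

where $C_i = N_i! / [K_i! \, (N_i - K_i)!]$ is the binomial coefficient.

The usual procedure for finding $B_1$ and $B_2$ is to solve the equations

$$\partial\Gamma/\partial B_1 = 0$$

$$\partial\Gamma/\partial B_2 = 0.$$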
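The log-likelihood of Equation (6) can be sketched in a few lines. This is an illustration, not the author's FORTRAN program: it assumes the shifted-exponential form for G(x), uses the large-n relation between the can and organism probabilities, and takes the data as hypothetical `(x_i, n_i, N_i, K_i)` tuples. The clamping of the probability is an added safeguard so that doses with no cans or all cans sterilized do not produce log(0).

```python
import math

def phi_shifted_exponential(x, n, b1, b2):
    """Can-sterilization probability exp(-n * (1 - G(x))),
    with G(x) = 1 - exp(-b1 * (x - b2))."""
    return math.exp(-n * math.exp(-b1 * (x - b2)))

def log_likelihood(data, b1, b2):
    """Sum over dose groups of
    ln C_i + K_i ln(Phi_i) + (N_i - K_i) ln(1 - Phi_i).

    data: iterable of (x_i, n_i, N_i, K_i) tuples.
    """
    total = 0.0
    for x, n, N, K in data:
        phi = phi_shifted_exponential(x, n, b1, b2)
        # Keep phi strictly inside (0, 1) so both logarithms are finite.
        phi = min(max(phi, 1e-300), 1.0 - 1e-16)
        total += math.log(math.comb(N, K))
        total += K * math.log(phi) + (N - K) * math.log(1.0 - phi)
    return total
```

Because the term ln C_i does not depend on the parameters, it can be dropped when only the location of the maximum is wanted; it is kept here so the function matches Equation (6) term by term.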
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The  plausible  forms  for  G(x)  all  lead  to  equations  which  are 
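too complicated to solve by simple formulas. Usually one uses instead a successive approximation scheme, like the one written in matrix form as

$$\beta = B - (\Gamma'')^{-1} \Gamma' \qquad (7)$$

where

$$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad \Gamma' = \begin{bmatrix} \partial\Gamma/\partial B_1 \\ \partial\Gamma/\partial B_2 \end{bmatrix}, \quad \Gamma'' = \begin{bmatrix} \partial^2\Gamma/\partial B_1^2 & \partial^2\Gamma/\partial B_1 \partial B_2 \\ \partial^2\Gamma/\partial B_2 \partial B_1 & \partial^2\Gamma/\partial B_2^2 \end{bmatrix}$$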

In using this, an initial guess is made for $B_1$ and $B_2$. The terms on the right of Equation (7) are evaluated for those values of $B_1$ and $B_2$, and the quantities $\beta_1$ and $\beta_2$ are calculated using Equation (7). These are then taken as the new values of $B_1$ and $B_2$ and the process is repeated. This continues until the B's and $\beta$'s are equal to some desired accuracy.
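The iteration just described can be sketched as follows. This is a minimal illustration under stated assumptions: the derivatives are approximated by central differences rather than by the explicit formulas the author's program uses, and the function being maximized is a toy concave surface standing in for ln P. All names are hypothetical.

```python
def gradient_and_hessian(f, b1, b2, h=1e-3):
    """Central-difference first and second derivatives of f(b1, b2)."""
    f0 = f(b1, b2)
    f1p, f1m = f(b1 + h, b2), f(b1 - h, b2)
    f2p, f2m = f(b1, b2 + h), f(b1, b2 - h)
    g1 = (f1p - f1m) / (2 * h)
    g2 = (f2p - f2m) / (2 * h)
    h11 = (f1p - 2 * f0 + f1m) / h**2
    h22 = (f2p - 2 * f0 + f2m) / h**2
    h12 = (f(b1 + h, b2 + h) - f(b1 + h, b2 - h)
           - f(b1 - h, b2 + h) + f(b1 - h, b2 - h)) / (4 * h**2)
    return (g1, g2), (h11, h12, h22)

def newton_maximize(f, b1, b2, tol=1e-8, max_iter=50):
    """Repeat beta = B - (Hessian)^-1 * gradient until B stops changing."""
    for _ in range(max_iter):
        (g1, g2), (h11, h12, h22) = gradient_and_hessian(f, b1, b2)
        det = h11 * h22 - h12 * h12
        if det == 0.0:
            raise ArithmeticError("singular Hessian")
        # Solve the 2x2 system Hessian * step = gradient by Cramer's rule.
        s1 = (h22 * g1 - h12 * g2) / det
        s2 = (h11 * g2 - h12 * g1) / det
        new_b1, new_b2 = b1 - s1, b2 - s2
        if abs(new_b1 - b1) < tol and abs(new_b2 - b2) < tol:
            return new_b1, new_b2
        b1, b2 = new_b1, new_b2
    raise ArithmeticError("no convergence")

# Toy concave surface with its maximum at (1, -3); a stand-in for ln P.
def toy_surface(b1, b2):
    return -(b1 - 1.0) ** 2 - 2.0 * (b2 + 3.0) ** 2 - (b1 - 1.0) * (b2 + 3.0)

b1_hat, b2_hat = newton_maximize(toy_surface, 0.0, 0.0)
```

On a real likelihood the starting values matter, which is why the program described later seeds the iteration with least-squares estimates.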


The properties of this Gauss-Newton iteration scheme are reasonably well-known. It converges if the initial guess is good enough, and the Hessian matrix, $\Gamma''$, is positive-definite. If it converges, the inverse of the Hessian matrix, $(\Gamma'')^{-1}$, gives the estimated variance-covariance matrix of $B_1$ and $B_2$. However, the method may occasionally fail to converge.

Given any assumed form for $G(x)$, one can write explicit formulas for the quantities $\partial\Gamma/\partial B_1$, $\partial\Gamma/\partial B_2$, $\partial^2\Gamma/\partial B_1^2$, $\partial^2\Gamma/\partial B_1 \partial B_2$, etc. as functions of $B_1$ and $B_2$. These formulas are necessary, but they are complicated and not especially informative, so we omit them.

The calculations involved in carrying out the ML method are obviously very tedious. However, the author has prepared a FORTRAN computer program which does the calculation of $B_1$ and $B_2$, then finds the 12D dose, $x_0$. The program also finds confidence limits on $x_0$ and the logarithm of the survival probability at $x = 0$. It does all of these calculations for
each of the following five general forms of $G(x)$:

    $G(x) = 1 - \exp\{-B_1(x - B_2)\}$      shifted exponential

    $G(x) = 1 - \exp\{-(B_1 x)^{B_2}\}$     Weibull

    $G(x) = F\{B_1(x - B_2)\}$              normal

    $G(x) = F\{B_1 \ln(x/B_2)\}$            lognormal

    $G(x) = 1 - \exp(-B_1 x)$               unshifted exponential

where

$$F(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2} \, dt$$


The program receives the inoculated pack data as input, including doses where all or none of the cans are sterilized as well as partial-spoilage doses. It first carries out least-squares fitting of the data from partial-spoilage doses only, obtaining in this way initial estimates of $B_1$ and $B_2$ for all five forms of $G(x)$. These are used to start the ML method. Having found the optimizing (i.e., maximizing) values of $B_1$ and $B_2$, the program also finds for each form the quantity


$$\chi^2 = \sum_{i=1}^{M} \frac{(K_i - N_i \Phi_i)^2}{N_i \Phi_i (1 - \Phi_i)}$$

This is distributed approximately as a $\chi^2$ variable with $M - 2$ degrees of freedom and is an overall measure of how well that form can be made to fit the data.
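The goodness-of-fit quantity is the usual Pearson sum for binomial counts, and a short sketch makes the arithmetic concrete. The counts and fitted probabilities below are made up for illustration; the function name is hypothetical.

```python
def pearson_chi_square(N, K, phi):
    """Sum of (K_i - N_i*Phi_i)^2 / (N_i*Phi_i*(1 - Phi_i)) over dose groups."""
    total = 0.0
    for n_cans, k_sterile, p in zip(N, K, phi):
        expected = n_cans * p
        variance = n_cans * p * (1.0 - p)
        total += (k_sterile - expected) ** 2 / variance
    return total

# Made-up counts and fitted probabilities, for illustration only.
chi2 = pearson_chi_square(N=[10, 10], K=[4, 8], phi=[0.5, 0.8])
```

The result is compared with the tabulated chi-square point for M - 2 degrees of freedom; a fitted probability of exactly 0 or 1 would make a term undefined, so such groups need separate handling in practice.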
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3.  Example . In  this  Section  we  describe  an  Inoculated 
pack that was recently carried out at the U. S. Army Natick R&D Command and show the results of using the proposed method of data analysis.

An inoculated pack was done at -30°C using C. botulinum spores in canned pork. The data (Anellis, unpublished, based on can-swelling) are shown in Table 1. The results from the preliminary least-squares (LS) fitting and the final, maximum likelihood (ML), estimates are shown in Table 2.

Figure 1 is a graph of the data points and the four distributions fitted by the ML method.

Examining the ML results, we see that the normal distribution predicts that the entire 95% confidence interval of Z lies below -1, i.e., there is more than 90% kill at zero dose with 95% confidence. We therefore discard this distribution. For the exponential distribution ML predicts a very small shift (Z = -.23) whose 95% confidence limits straddle zero. There is, therefore, no reason to conclude that the shift is non-zero, which means that in this case the simple-exponential hypothesis is acceptable (i.e., it is not contradicted by the data). Similarly, the Weibull shape parameter is very close to 1.0 (.9733), which also supports this hypothesis. These two distributions give almost the same 12-D dose, $x_0 = 3.89$, and the 12D-dose of a simple-exponential is $x_0 = 3.83$. The Schmidt-Nank formula yields $x_0 = 3.76$.

The lognormal leads to the estimate $x_0 = 4.11$.

The theoretical value of $\chi^2$ is $\chi^2_{M-2}(.95) = \chi^2_4(.95) = 9.49$, which exceeds the computed $\chi^2$ for all four distributions, so we have no evidence against any of the four distributions on grounds of goodness-of-fit.


In this case we can adopt a procedurally conservative viewpoint and reason as follows. In the past, the simple exponential has always been used. The data do not refute its use here, so we may conclude that the distribution is exponential, the best 12D estimate is 3.89, and the 95% confidence limits are $3.62 \le x_0 \le 4.32$. An alternative algorithm is to suspend judgment on the distribution but use the largest 12D-value given by any acceptable distribution. This leads to use of the lognormal estimates, $x_0 = 4.11$ and $3.73 \le x_0 \le 4.77$.

Either of these two viewpoints can be taken in this case, and the two 12-D values obtained are not statistically different at the 95% confidence level. Also, in practical terms the difference between 3.9 and 4.1 megarads for 12D is not very important.

4. Discussion. The ML method has the following advantages:

(i) It is a generally accepted statistical procedure.

(ii) It is a very flexible method that can be used with many different assumed distribution functions.

(iii) Because it uses the data at points where $K_i = 0$ or $K_i = N_i$, it comes closer than existing methods to using all the information that is in the data.

It has two drawbacks, namely, it is complicated and may occasionally fail to converge. The former is not a problem since a computer program already exists for it, and the latter happens very rarely in the writer's experience.

On balance, it appears that the ML method is promising and deserves further study.

95% confidence ranges for theoretical probabilities at doses where no cans or all cans are sterilized.


CONFIDENCE BOUNDS FOR THE GENERAL LINEAR MODEL

In this paper, for the general linear model $Y = X\beta + e$, we consider the construction of confidence bounds about the entire regression line. To accomplish this we exploit a powerful theorem of Scheffé. A procedure often encountered is one in which a set of confidence intervals about $E(y|x)$ or prediction intervals for future observations are determined and then the end points are connected in such a fashion as to describe an envelope. The belief is that what has been accomplished is precisely what Scheffé's theorem allows one to do.

In  addition,  we  present  some  extensions  concerning  confidence 
bounds  about  combinations  of  regression  lines  and  suggest  a useful 
application  of  these  results.  Specifically,  we  propose  to  use  the 
confidence  bounds  about  the  difference  of  regression  lines  to  make  a 
quantitative assessment of when and where independent sets of data
characterising  the  same  phenomena  are  in  agreement  or  disagreement. 
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1.  INTRODUCTION 


It is appropriate at the outset that we devote a few paragraphs to the introduction of the general linear hypothesis model of full rank. We want to consider uncorrelated observations $y_1, \ldots, y_n$ that satisfy the relation

$$y_i = \sum_{j=1}^{p} x_{ij} \beta_j + e_i, \qquad i = 1, 2, \ldots, n \qquad (1.1)$$

and are linear in the unknown parameters $\beta_1, \ldots, \beta_p$, with known coefficients $x_{ij}$ and random term $e_i$ satisfying

$$E(y_i) = \sum_{j=1}^{p} x_{ij} \beta_j$$

and

$$\mathrm{Var}(y_i) = \sigma^2.$$

In other words, the random term $e_i$ is a random variable with expected value $E(e_i)$ equal to zero and unknown variance $\mathrm{Var}(e_i)$ equal to $\sigma^2$. The problem, in its most general sense, involves determining point and interval estimates of several quantities of interest of the model and the testing of various statistical hypotheses.


For compactness of notation and ease of manipulation let

$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix};$$

then we can write the system of relations (1.1) as

$$Y = X\beta + e$$

and proceed to define the general linear hypothesis model of full rank as follows:

Definition 1.1. The model $Y = X\beta + e$, where $Y$ is a random observed vector, $e$ is a random vector, $X$ is an $n \times p$ matrix of known fixed quantities, and $\beta$ is a vector of unknown parameters, is called the general linear hypothesis model of full rank, provided the rank of $X$ is equal to $p$ where $p \le n$.

In the present inquiry we restrict our consideration to the normal theory case, which means the random vector $e$, already satisfying $E(e) = 0$ and $\mathrm{cov}(e) = \sigma^2 I$, will, in addition, be assumed to be normally distributed.

The problem most frequently addressed is that of estimating the unknown parameters $\beta_j$ on the basis of the observations $y_i$. These estimates of $\beta_j$, denoted by $\hat\beta_j$, are functions of $y_i$ and, as such, are themselves random variables about which confidence intervals can be constructed. These ideas are fully developed in a number of textbooks.¹,² A point not so widely expounded is that the usual frequency interpretation of a confidence interval based on a single sample $y_1, y_2, \ldots, y_n$ applies only to a single coefficient $\beta_j$; if the same data are used to determine confidence intervals for both $\beta_i$ and $\beta_j$, $i \ne j$, the probability is not $1 - \alpha$ that the confidence intervals thus constructed will simultaneously contain $\beta_i$ and $\beta_j$. The complexity is advanced by the fact that the interval estimates are not independent; so, in general, only a single confidence statement can be made from a single set of observations.

It is not our intent here to address this problem directly; such an inquiry falls into the general area of simultaneous confidence intervals. It is our intent, however, to consider a ramification of this problem; namely, the construction of a confidence envelope about the entire regression line. We will, in addition, provide some results concerning confidence envelopes about combinations of regression lines and implications of their use.

Toward this end consider the following definition due to Bose³:


Definition 1.2. A parametric function $\psi$ is called an estimable function if it has an unbiased linear estimate, i.e., if there exists an n-vector $a$ of constant coefficients such that $E(a'Y) = \psi$.

1 Graybill, F. A., An Introduction to Linear Statistical Models, Volume I, McGraw-Hill Book Company, Inc., New York, 1961.

2 Rao, C. R., Linear Statistical Inference and Its Applications, John Wiley & Sons, Inc., New York, 1965.

3 Bose, R. C., "The Fundamental Theorem of Linear Estimation", Proceedings of the 31st Indian Science Congress, 1944, pp. 2-3.

If $L$ is a p-dimensional space of estimable functions with basis $\{\psi_1, \psi_2, \ldots, \psi_p\}$ and $\hat\psi$ is the least squares estimate of $\psi \in L$, then we have the following theorem due to Scheffé⁴.

Theorem 1.1. Under the general linear hypothesis model (normal case) the probability is $1 - \alpha$ that simultaneously for all $\psi \in L$

$$\hat\psi - S\hat\sigma_{\hat\psi} \le \psi \le \hat\psi + S\hat\sigma_{\hat\psi}$$

where the constant $S = [p F_\alpha(p, n-r)]^{1/2}$ and rank $X = r$.


The implications of this theorem are far reaching; and in this article we will exploit a single facet, albeit an important and useful one. To facilitate this we need to be aware of the fact that since least squares estimates $\hat\beta$ are BLUE, the elements of the vector $\beta$ of the general linear model of full rank form a basis of a space $L$ of estimable functions which includes polynomials as a special case.


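4 Scheffé, H., The Analysis of Variance, John Wiley & Sons, Inc., New York, 1959.


2. CONFIDENCE REGION FOR A POLYNOMIAL

To determine a confidence region for a polynomial with observational equations

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_{p-1} x_i^{p-1} + e_i, \qquad i = 1, 2, \ldots, n$$

in the model $Y = X\beta + e$, the $n \times p$ matrix $X = (x_{ij})$ of known constant coefficients takes the form

$$X = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{p-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{p-1} \end{bmatrix}$$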


The least squares estimate of $\beta$ is given by $\hat\beta = (X'X)^{-1}X'Y$. If we choose $\psi_i = \beta_i$, $i = 0, 1, \ldots, p-1$, then $\{\psi_i\} = \{\beta_i\}$ is a set of $p$ linearly independent estimable functions which forms a basis for the space $L$. For any value $x_0$ denote $X_0' = (1, x_0, \ldots, x_0^{p-1})$. Clearly, $E(y_0) = X_0'\beta \in L$ and has least squares estimate

$$X_0'\hat\beta = X_0'(X'X)^{-1}X'Y = \sum_{i=1}^{n} a_i y_i$$

where the coefficients $a_i$ are the elements of the $1 \times n$ vector $X_0'(X'X)^{-1}X'$. Thus

$$\sum_{i=1}^{n} a_i^2 = X_0'(X'X)^{-1}X'[X_0'(X'X)^{-1}X']' = X_0'(X'X)^{-1}X'X(X'X)^{-1}X_0 = X_0'(X'X)^{-1}X_0$$

so that $\sigma^2_{\hat\psi} = \sigma^2 X_0'(X'X)^{-1}X_0$ with unbiased estimate $\hat\sigma^2_{\hat\psi} = s^2 X_0'(X'X)^{-1}X_0$.

From Scheffé's theorem we can assert with probability $1 - \alpha$ that simultaneously for all $\psi \in L$ and, in particular, $X_0'\beta \in L$,

$$X_0'\hat\beta - S\hat\sigma_{\hat\psi} \le X_0'\beta \le X_0'\hat\beta + S\hat\sigma_{\hat\psi}$$

where $S = [p F_\alpha(p, n-p)]^{1/2}$.
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As  an  Illustration,  suppose  the  paired  data  (1.20,  0.34), 

(1.37, 0.94), (1.38, 0.99), (1.65, 1.58), (1.71, 2.08), (1.82, 2.25) are characterized by the quadratic $y = -0.31x^2 + 3.97x - 3.95$ over the interval of interest, $1 \le x \le 2$. The 95% confidence region for the entire true line is given by

$$X_0'\hat\beta - (5.28)\hat\sigma_{\hat\psi} \le X_0'\beta \le X_0'\hat\beta + (5.28)\hat\sigma_{\hat\psi}$$

as shown in Figure 1.
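The multiplier 5.28 can be checked directly from the constant in Scheffé's theorem. With the six data pairs and a quadratic, n = 6 and p = 3; the tabulated point F_.05(3, 3) of about 9.28 is taken here from standard F tables and is not quoted in the paper.

```latex
S = \left[\, p \, F_{\alpha}(p,\; n-p) \,\right]^{1/2}
  = \left[\, 3 \, F_{0.05}(3,\; 3) \,\right]^{1/2}
  \approx (3 \times 9.28)^{1/2}
  \approx 5.28
```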

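Grubbs⁵ showed that for the case $y = \beta_0 + \beta_1 x$ the confidence bounds resulting from Scheffé's theorem are

$$\hat\beta_0 + \hat\beta_1 x \pm [2F_\alpha(2, n-2)]^{1/2} \, S \left[ \frac{1}{n} + \frac{n(x - \bar{x})^2}{S_{xx}} \right]^{1/2} \qquad (2.1)$$

where

$$S = \left[ \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2 \right]^{1/2}$$

and $S_{xx} = n \sum x_i^2 - [\sum x_i]^2$.

Note that the value $x$ appearing in (2.1) is not limited to an $x_i$ which appears in the observations $(x_i, y_i)$, $i = 1, 2, \ldots, n$.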
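The straight-line bounds of (2.1) involve only sums over the data, so they can be sketched without any matrix library. In the sketch below the F point is supplied by the caller (for example from a printed table), since the standard library has no F quantile function; the function names are hypothetical.

```python
import math

def scheffe_line_band(xs, ys, f_crit):
    """Return (predict, half_width) for a least-squares straight line.

    f_crit is the upper-alpha point F_alpha(2, n-2), supplied by the caller.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = n * sum(x * x for x in xs) - sx * sx        # n*sum(x^2) - (sum x)^2
    sxy = n * sum(x * y for x, y in zip(xs, ys)) - sx * sy
    b1 = sxy / sxx
    b0 = (sy - b1 * sx) / n
    x_bar = sx / n
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (n - 2))                       # residual std. deviation

    def predict(x):
        return b0 + b1 * x

    def half_width(x):
        return math.sqrt(2.0 * f_crit) * s * math.sqrt(
            1.0 / n + n * (x - x_bar) ** 2 / sxx)

    return predict, half_width
```

The band is narrowest at the mean of the x values and widens in both directions, and `half_width` may be evaluated at any x, not only at observed points, in line with the remark following (2.1).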


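5 Grubbs, F. E., Linear Statistical Regression and Functional Relations, BRL Report No. 1842, November 1975.


3. THE TWO-SAMPLE CASE


Suppose two independent sets of data have given rise to two characterizations of the same phenomenon so that we are now confronted with what is, in essence, two models:

$$Y_1 = X_1\beta_1 + e_1, \quad \text{an } n_1 \times p_1 \text{ problem,}$$

and

$$Y_2 = X_2\beta_2 + e_2, \quad \text{an } n_2 \times p_2 \text{ problem.}$$

We can still represent this situation as $Y = X\beta + e$ where now

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}.$$

The least squares estimate is given, as before, by $\hat\beta = (X'X)^{-1}X'Y$.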

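Consider now the difference of two polynomials $y_1^* - y_2^* \in L$ with LS estimate $X_1^{*\prime}\hat\beta_1 - X_2^{*\prime}\hat\beta_2$, where $X_i^{*\prime} = (1, x^*, \ldots, x^{*\,p_i - 1})$. Rewriting,

$$X_1^{*\prime}\hat\beta_1 - X_2^{*\prime}\hat\beta_2 = [\, X_1^{*\prime} \mid -X_2^{*\prime} \,] \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \end{bmatrix} = X^{*\prime}(X'X)^{-1}X'Y = \sum_{i=1}^{n_1+n_2} a_i y_i$$

where the coefficients $a_i$ are the elements of the $1 \times (n_1 + n_2)$ vector $X^{*\prime}(X'X)^{-1}X'$. Thus,

$$\sum a_i^2 = X^{*\prime}(X'X)^{-1}X'[X^{*\prime}(X'X)^{-1}X']' = X^{*\prime}(X'X)^{-1}X'X(X'X)^{-1}X^* = X^{*\prime}(X'X)^{-1}X^*.$$

Now $X^{*\prime} = [\, X_1^{*\prime} \mid -X_2^{*\prime} \,]$ and

$$(X'X)^{-1} = \begin{bmatrix} (X_1'X_1)^{-1} & 0 \\ 0 & (X_2'X_2)^{-1} \end{bmatrix}$$

so $X^{*\prime}(X'X)^{-1}X^* = X_1^{*\prime}(X_1'X_1)^{-1}X_1^* + X_2^{*\prime}(X_2'X_2)^{-1}X_2^*$.

As before, $\mathrm{Var}(\hat\psi)$ has the unbiased estimate $\hat\sigma^2_{\hat\psi} = s^2 X^{*\prime}(X'X)^{-1}X^*$, where $s^2$ is now the pooled estimate of the variance, and with probability $1 - \alpha$ simultaneously for all $\psi \in L$

$$X_1^{*\prime}\hat\beta_1 - X_2^{*\prime}\hat\beta_2 - S\hat\sigma_{\hat\psi} \le y_1^* - y_2^* \le X_1^{*\prime}\hat\beta_1 - X_2^{*\prime}\hat\beta_2 + S\hat\sigma_{\hat\psi}$$

with $S = [(p_1 + p_2) F_\alpha(p_1 + p_2,\; n_1 + n_2 - (p_1 + p_2))]^{1/2}$.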

To illustrate one of the most useful potential applications of this result consider the situation where we are presented with two sets of data, $\{(x_{11}, y_{11}), (x_{12}, y_{12}), \ldots, (x_{1n}, y_{1n})\}$ and $\{(x_{21}, y_{21}), (x_{22}, y_{22}), \ldots, (x_{2m}, y_{2m})\}$, collected from the same process; and we want to say something about the similarity or dissimilarity of the two descriptions. Suppose each set is fitted with a quadratic; and we construct the confidence bound about the difference $y_1^* - y_2^*$ as shown in Figure 2. Over the region ($1.65 \le x \le 2.05$), where the confidence bounds cover the line $y = 0$, we will say the two descriptions are consistent, although the associated probability level cannot be attached without qualification and interpretation.

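The extension of Grubbs' result (2.1) to this case is direct; the bounds take the form

$$X_1^{*\prime}\hat\beta_1 - X_2^{*\prime}\hat\beta_2 \pm [(p_1 + p_2) F_\alpha(p_1 + p_2,\; n_1 + n_2 - p_1 - p_2)]^{1/2} \cdot S \cdot \left[ \frac{1}{n_1} + \frac{1}{n_2} + \frac{n_1 (x^* - \bar{x}_1)^2}{S_{xx}^{(1)}} + \frac{n_2 (x^* - \bar{x}_2)^2}{S_{xx}^{(2)}} \right]^{1/2}$$

where $S^2$ is the pooled estimate of variance and $S_{xx}^{(i)}$ is computed from the i-th data set.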

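4. THE k-SAMPLE CASE

The straightforward generalization to k sets of data proceeds as follows:

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & & & \\ & X_2 & & \\ & & \ddots & \\ & & & X_k \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_k \end{bmatrix}$$

and

$$\hat\beta = \begin{bmatrix} \hat\beta_1 \\ \hat\beta_2 \\ \vdots \\ \hat\beta_k \end{bmatrix} = \begin{bmatrix} (X_1'X_1)^{-1}X_1'Y_1 \\ (X_2'X_2)^{-1}X_2'Y_2 \\ \vdots \\ (X_k'X_k)^{-1}X_k'Y_k \end{bmatrix}.$$

Now for

$$X^{*\prime} = [\, X_1^{*\prime} \mid \cdots \mid X_k^{*\prime} \,], \quad \text{where } X_i^{*\prime} = (1, x^*, \ldots, x^{*\,p_i - 1}),$$

we can write $\hat{y}_i^* = X_i^{*\prime}\hat\beta_i = X_i^{*\prime}(X_i'X_i)^{-1}X_i'Y_i$, $i = 1, 2, \ldots, k$.

Suppose we now consider the estimate of $\sum c_i y_i^*$:

$$\sum c_i \hat{y}_i^* = \sum c_i X_i^{*\prime}\hat\beta_i = \sum c_i X_i^{*\prime}(X_i'X_i)^{-1}X_i'Y_i.$$

Rewriting,

$$\sum c_i X_i^{*\prime}\hat\beta_i = [\, c_1 X_1^{*\prime} \mid \cdots \mid c_k X_k^{*\prime} \,] \, \hat\beta = CX^{*\prime}(X'X)^{-1}X'Y = \sum_{i=1}^{n_1 + \cdots + n_k} a_i y_i$$

where the coefficients $a_i$ are the elements of the $1 \times \sum n_i$ vector $CX^{*\prime}(X'X)^{-1}X'$.

Thus,

$$\sum a_i^2 = CX^{*\prime}(X'X)^{-1}X'[CX^{*\prime}(X'X)^{-1}X']' = CX^{*\prime}(X'X)^{-1}X'X(X'X)^{-1}CX^* = CX^{*\prime}(X'X)^{-1}CX^*.$$

Here $CX^{*\prime} = [\, c_1 X_1^{*\prime} \mid \cdots \mid c_k X_k^{*\prime} \,]$, so

$$CX^{*\prime}(X'X)^{-1}CX^* = \sum_{i=1}^{k} c_i^2 X_i^{*\prime}(X_i'X_i)^{-1}X_i^*.$$

The confidence region now assumes the form

$$\sum c_i X_i^{*\prime}\hat\beta_i - S[s^2 \, CX^{*\prime}(X'X)^{-1}CX^*]^{1/2} \le \sum c_i y_i^* \le \sum c_i X_i^{*\prime}\hat\beta_i + S[s^2 \, CX^{*\prime}(X'X)^{-1}CX^*]^{1/2}$$

with $S = [\sum p_i \, F_\alpha(\sum p_i,\; \sum n_i - \sum p_i)]^{1/2}$.

For the linear case we obtain

$$CX^{*\prime}(X'X)^{-1}CX^* = \sum_{i=1}^{k} c_i^2 X_i^{*\prime}(X_i'X_i)^{-1}X_i^*$$

and the two-sample case (Section 3) is obtained by setting $c_1 = 1$ and $c_2 = -1$.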
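For straight-line fits the quadratic form for each data set reduces to the bracketed term of (2.1), so the combined factor can be sketched with plain sums. This is an illustration under that assumption; the function names and the sample x values are hypothetical.

```python
def line_variance_factor(xs, x_star):
    """Quadratic form for a straight-line fit at x_star:
    1/n + n * (x_star - x_bar)^2 / S_xx, with S_xx = n*sum(x^2) - (sum x)^2."""
    n = len(xs)
    sx = sum(xs)
    sxx = n * sum(x * x for x in xs) - sx * sx
    return 1.0 / n + n * (x_star - sx / n) ** 2 / sxx

def combined_variance_factor(x_sets, coeffs, x_star):
    """Sum over the k data sets of c_i^2 times the per-set quadratic form."""
    return sum(c * c * line_variance_factor(xs, x_star)
               for xs, c in zip(x_sets, coeffs))

# Two made-up designs; coefficients (1, -1) give the two-sample difference.
factor = combined_variance_factor([[0, 1, 2], [0, 1, 2]], [1, -1], 1.0)
```

Multiplying this factor by the pooled variance estimate and taking the square root gives the standard error that the Scheffé constant S scales in the bounds above.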


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most  of  the  papers  presented  at  that  meeting.  These  treat  various  Army 
statistical and design problems.
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management  of  reliability 
pharmacokinetic  models 
serum lipids
error-time  analysis 
reliability 
confidence  bounds 
time  series  analysis 
pairwise  contrasts 
stress-strength  models 
probability  distributions 
fatigue  life 
variance  estimation 


robust  outlier  detection 
random  number  generator 
eigenvectors  analysis 
Markov  chains 
continuous  sampling  plans 
Stein's  estimator 
robust  statistical  procedures 
small  samples 
estimation  and  prediction 
exponential  selection  problems 
maximum  likelihood  estimation 
confidence  bounds 
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Subject:  Errata  Sheet 


TO:  Recipients  of  Proceedings  of  the  22nd  Conference  on  the  Design  of 

Experiments  in  Army  Research,  Development  and  Testing 


The undersigned apologizes for some errors in my paper "Induction on a Markov chain" appearing on pages 177-186 of the proceedings. Four pen and ink corrections will correct these errors:

a^  The  denominator  in  Equation  (5)  should  be  1 r <12*^2  ^^^her  than 
1 - q^q^. 

b.  Equation  (7)  should  be  P(S2)  - P(S2)  rather  than  P(S1)  " P(S2). 

c.  The  numerator  of  the  second  term  In  Equation  (11),  following  the 
summation  sign,  should  be 

(^)p\^  ^(k-1)  rather  than 
1 11 

k 1 k-1 


d.  The  phrase  between  Equation  (27)  and  Equation  (28)  should  be: 
"If  ko+1  £.N/2  the  above  generalizes  to" 
rather  than 

"If  ko+1  ■ N/2  the  above  generalizes  to". 


RICHARD  H.  BRUCGER 
Mathematical  Statistician 
Quality  Evaluation  Division 
Product  Assurance  Directorate 
US Army Armament Materiel Readiness Command
Rock  Island,  IL  61201 
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